General

Idiomatic coding

A code snippet is said to be idiomatic if it uses its programming language the way the language is intended to be used.

Programming with idioms gives you a certain level of:

performance: benchmarks are by default based on the idiomatic ways to perform tasks in the language.
readability: your code will be more easily understood by the rest of the community, because the set of idioms acts as a standard.

Java

`&&` and `||` vs `&` and `|`

When the boolean operator is doubled, its evaluation is lazy (short-circuiting):

Given that a evaluates to false: a && b will not trigger the evaluation of b, but a & b will.
Given that a evaluates to true: a || b will not trigger the evaluation of b, but a | b will.

Modifiers order

From the OpenJDK guidelines:

public / private / protected
abstract
static
final
transient
volatile
**default**
synchronized
native
strictfp

`-Xms` & `-Xmx`

Passing the following arguments to the JVM:

java -Xms256m -Xmx2048m

will reserve an initial amount of 256 MB of memory for the JVM’s heap and allow the heap to use up to 2048 MB at runtime.

Java 8: Interfaces

We will take the following interface as an example for this section:

@FunctionalInterface
interface Square {
  // implicitly public static final
  String NAME = "greatSquare";

  // the single abstract method
  int calculate(int x);

  default int calculateFromString(String s) {
    return this.calculate(Integer.parseInt(s));
  }

  static void printName() {
    System.out.println("I'm Square interface named " + Square.NAME);
  }
}

functional interface

Because Square declares a single abstract method, calculate, a concrete implementation of the Square interface can be obtained by implementing it with a lambda, as follows:

Square sq = (int x) -> x*x;

The @FunctionalInterface annotation is there to guarantee, at compile time, that the interface has exactly one abstract method.

Making a lambda Serializable

At instantiation time, you can use an intersection cast to make your lambda serializable (Serializable is an empty marker interface):

(Runnable & Serializable) () -> System.out.println("runnable is running")

`default` method in interface vs abstract method in abstract classes ?

(post)

When in doubt, prefer an interface with default methods: it carries more constraints, which may allow more compiler optimizations.
Interfaces with default methods can be used as blocks of behavior that enrich a class, because a class can extend only one (abstract) class but implement as many interfaces as needed. They can be used in the style of Scala mixins:

interface Printer {  
  default void print(String content) {  
    System.out.println(content);  
  }  
}  
  
interface Named {  
  default String getName() {  
    return this.toString();  
  }  
}  
  
class NamePrinter implements Named, Printer {  
  void printName() {  
    this.print(this.getName());  
  }  
}

`static` methods

Unlike static methods of (abstract) classes, static methods of interfaces cannot be called through an instance:

sq.printName();      // does not compile
Square.printName();  // compiles

interface attributes

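Interface attributes are implicitly public static final; these are the only modifiers they accept, and writing them is redundant. Unlike static interface methods, a static attribute can still be read through an instance, although going through the interface name is the clearer style:

String viaInstance = sq.NAME;       // compiles, but discouraged
String viaInterface = Square.NAME;  // compiles, preferred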

`++`

Given a variable counter of a numeric primitive type (int, …) or of the corresponding boxed type (Integer, …), the statement

return ++counter;

is syntactic sugar for

counter += 1;
return counter;

and

return counter++;

is syntactic sugar for

counter += 1;
return counter - 1;

Getting objects names `getName` vs `getCanonicalName` vs `getSimpleName` vs `getTypeName`

Source: Stack Overflow post by Nick Holt

int.class (primitive):
    getName():          int
    getCanonicalName(): int
    getSimpleName():    int
    getTypeName():      int

String.class (ordinary class):
    getName():          java.lang.String
    getCanonicalName(): java.lang.String
    getSimpleName():    String
    getTypeName():      java.lang.String

java.util.HashMap.SimpleEntry.class (nested class):
    getName():          java.util.AbstractMap$SimpleEntry
    getCanonicalName(): java.util.AbstractMap.SimpleEntry
    getSimpleName():    SimpleEntry
    getTypeName():      java.util.AbstractMap$SimpleEntry

new java.io.Serializable(){}.getClass() (anonymous inner class):
    getName():          ClassNameTest$1
    getCanonicalName(): null
    getSimpleName():    (empty string)
    getTypeName():      ClassNameTest$1

Diamond problem

Scala

Constructor parameters scope

(case) class C(... a: Int)

Visibility	Accessor?	Mutator?
var	Yes	Yes
val	Yes	No
Default visibility (no var or val)	No for classes, Yes for case classes	No
Adding the private keyword to var or val	No	No

Equals

== is an infix alias for the equals(other: Any): Boolean method

val g = 1
val f = new AnyRef {
 override def equals(other: Any) = true
}
f.equals(g)   // true: Boolean
f == g        // true: Boolean
g == f        // false: Boolean

`AnyVal`’s `equals`

This is a special case where == and equals are not equivalent: == converts the numeric types before performing the equality check, while equals does not.

'a' == 97  // true
97L == 'a'  // true
'a'.equals(97)  // false
97L.equals('a')  // false

Closures

This is a closure: the variable a used in f is captured by f, and modifications of a influence the executions of f.

var a = new StringBuilder
a.append("a")
val f = ()=>println(a)
f()
a = new StringBuilder
a.append("b")
f()

Result:

a
b

This does not close over a: the value of a is copied into the parameter i of g, and the closure returned by g only holds that copy of the reference to the StringBuilder. Reassigning the variable a in the outer scope therefore doesn’t influence the closure returned by g.

var a = new StringBuilder
a.append("a")
val g = (i:StringBuilder)=> (()=>{println(i)})
val f = g(a)
f()
a = new StringBuilder
a.append("b")
f()

Result:

a
a

BUT: since a is an AnyRef, the copy is just a copy of the reference; if the object itself (not the variable a) is modified, the modification influences the execution of the closure returned by g.

var a = new StringBuilder
a.append("a")
val g = (i:StringBuilder)=> (()=>{println(i)})
val f = g(a)
f()
val b = a
a = new StringBuilder
a.append("b")
f()
b.append("c")
f()

Result:

a
a
ac

CONCLUSION:

AnyVal passed to function as argument x: x is a val containing the value (in bytecode it’s a primitive) of the AnyVal object.
AnyRef passed to function as argument x: x is a val containing the reference to the AnyRef object.
AnyVal or AnyRef captured by a closure through variable a: the closure is directly linked to the variable of the outer scope, as long as it is reachable.

NOTE: for the last case, if the variable disappears from scope, the last reference it contained is used (this reference is not a copy of the object, and the object can still be operated on by another owner of the same reference):

var b = new StringBuilder()
val f = () =>
{
  var a = new StringBuilder().append("1")
  val g = () => println(a)
  a = new StringBuilder().append("2")
  g()  // 2
  a = new StringBuilder().append("3")
  g()  // 3
  b = a
  g
}

val gReturned = f()
gReturned()  // 3
b.append("4")
gReturned()  // 34

Returns

(Idem for Java, Python) String equality

For string literals, value equality coincides with reference equality, because the compiler interns string literals to save memory (shared storage) and comparison time (only the locations are compared instead of every character).
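A minimal Python sketch of the same idea, assuming CPython. The variable names are illustrative. `sys.intern` is the standard way to request interning explicitly; the automatic interning of equal literals is an implementation detail, so the last line is shown without an expected output.

```python
import sys

literal_a = "idiomatic"
literal_b = "idiomatic"
# Built at runtime, so it is not automatically interned.
built = "".join(["idio", "matic"])

print(literal_a == built)                          # True: same characters
print(sys.intern(built) is sys.intern(literal_a))  # True: one shared object

# CPython detail, not a language guarantee: equal literals in one module
# usually end up being the same object.
print(literal_a is literal_b)
```

In application code, value comparison (`==`) remains the safe choice; identity comparison only makes sense for strings that were interned on purpose.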

Implicit conversions

[…]Expanding on Kevin’s answer and explaining why it’s not possible for scaladoc to tell you what implicit conversion exists: implicit conversions only come into play when your code would not compile otherwise. You can see it as an error recovery mechanism for type errors that is activated during compilation.

Partial application

val f = (x: Int, n: Int) => Math.pow(x, n).toInt
val square = f(_: Int, 2)
square(3)  // 9

Passing a collection as variable-length arguments:

object A {
  def b(s: String*) = s.foreach(println)
}
// Unpacking a Seq into separate arguments:
A.b(Seq("a", "b", "c"): _*)

Blocks

val a = {println(5); i: Int => println(i); i*2}

a is the result of evaluating the block: a function of type Int => Int, which is impure because it prints its input i.
Executing this line only prints 5.
a(1) only prints 1 and returns 2.

The block is evaluated even when it is passed directly to a function that takes another function as input:

object Test extends App{
  def b(f: Int => Unit): Unit = {  
    println("start")  
    f(1)  
    println("end")  
  }  
  b({println(5); i: Int => println(i)})
}

>>> 5
>>> start
>>> 1
>>> end

To delay this evaluation, we use lazy evaluation with a call-by-name parameter:

object Test extends App{
  def b(f: => Int => Unit): Unit = {  
    println("start")  
    f(1)  
    println("end")  
  }  
  b({println(5); i: Int => println(i)})
}

>>> start
>>> 5
>>> 1
>>> end

Final class members

Difference between val const and final val const ? final members cannot be overridden, say, in a sub-class or trait.

isInstanceOf vs isInstance

For reference types (those that extend AnyRef), there is no difference in the end result. isInstanceOf is however much encouraged, because it’s more idiomatic (and likely much more efficient).

For primitive value types, such as numbers and booleans, there is a difference:

scala> val x: Any = 5
x: Any = 5

scala> x.isInstanceOf[Int]
res0: Boolean = true

scala> classOf[Int].isInstance(x)
res1: Boolean = false

That is because primitive value types are boxed when upcasted to Any (or AnyVal). For the JVM, they appear as java.lang.Integers, not as Ints anymore. The isInstanceOf knows about this and will do the right thing.

In general, there is no reason to use classOf[T].isInstance(x) if you know right then what T is. If you have a generic type, and you want to “pass around” a way to test whether a value is of that class (would pass the isInstanceOf test), you can use a scala.reflect.ClassTag instead. ClassTags are a bit like Classes, except they know about Scala boxing and a few other things. (I can elaborate on ClassTags if required, but you should find info about them elsewhere.)

Self-Type

case class Word(letters: String)

trait Splittable {
  this: Word =>  
  // all the attributes and methods of `Word` are accessible
  // through `this` even though `Splittable` does not inherit from `Word`
  def split(by: String): Seq[String] = this.letters.split(by).toSeq
}

val word = new Word("Hello World!") with Splittable

println(word.split(" "))  // prints "ArraySeq(Hello, World!)"

Here the trait Splittable can only be mixed into a Word. Its purpose is reduced to extending the behavior of a Word, which makes it possible to implement a nice Decorator pattern in a functional style.

Python

`threading` vs `multiprocessing` modules

The threading module lets you spawn threads in a Python script and run your tasks concurrently. One BIG limitation compared to C or Java threads is that the GIL (Global Interpreter Lock) only allows one thread at a time to be executed by the Python interpreter, which makes parallel processing with threads impossible (i.e. threads execute concurrently but not in parallel). This makes the library irrelevant for CPU-intensive tasks, but it is still good enough for IO-intensive ones.
The multiprocessing module, on the other hand, allows parallel processing by running tasks in separate processes, each one tied to its own Python interpreter. Note that, since your tasks are different processes, they cannot share heap memory like threads do.

-> A limitation of Python is that your tasks cannot both execute in parallel and share memory at the same time.
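A small sketch of the threading half of this trade-off: several threads update one object living on the shared heap. The function name `count_with_threads` and its parameters are hypothetical, chosen for illustration only. Processes started with multiprocessing would each work on their own copy of `state` instead.

```python
import threading


def count_with_threads(n_threads: int = 4, n_increments: int = 10_000) -> int:
    """Increment one shared counter from several threads and return its final value."""
    state = {"counter": 0}  # lives on the heap, visible to every thread
    lock = threading.Lock()

    def worker() -> None:
        for _ in range(n_increments):
            with lock:  # critical section: read-modify-write on shared state
                state["counter"] += 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["counter"]


print(count_with_threads())  # 40000
```

The threads take turns because of the GIL, so this does not run faster than a single loop; it only shows that the memory is shared.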

`threading` lib

Critical section

ex: accessing or updating a shared state

import threading


class CustomThread(threading.Thread):
    # class attribute: a single lock shared by all instances
    shared_lock = threading.Lock()

    def run(self):
        # non-critical work goes here

        with self.shared_lock:
            # lock taken: critical work goes here
            pass

        # lock released: non-critical work goes here

for-else, break, continue

break: ends the loop

for …: … else: … : the else block is only executed if break was not called

continue: skips the rest of the loop’s block and jumps to the next iteration
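The three keywords can be seen together in a short sketch. The function `first_negative` is a hypothetical helper written only for this illustration.

```python
def first_negative(values):
    """Return the first negative value, or None when the loop ends without break."""
    for v in values:
        if v >= 0:
            continue      # skip the rest of the body, go to the next item
        found = v
        break             # leave the loop; the else block is skipped
    else:
        found = None      # runs only when the loop was not ended by break
    return found


print(first_negative([3, -1, 4]))  # -1
print(first_negative([3, 1, 4]))   # None
```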

repr, str

In notebook cell

class A:
    def __repr__(self):
        return "repr"
#   def __str__(self):
#       return "str"
print(A())
A()

outs:

repr
repr

class A:
    def __repr__(self):
        return "repr"
    def __str__(self):
        return "str"
print(A())
A()

outs:

str
repr

–> print uses __str__ when it is defined and falls back to __repr__ otherwise, but the cell always outputs the repr of its last expression

Useful imports for OOP

from overrides import overrides  # decorator '@overrides' 
from abc import ABC, abstractmethod  #  'class C(ABC)' is abstract and decorator '@abstractmethod' usable.

Ada

Bindings to Java concepts

Mainly from AdaCore’s Ada for the C++ or Java Developer.

Imports

Java:

// unnecessary: java.lang is imported implicitly
import java.lang.System;

Ada:

-- `use` makes the package elements usable without the package prefix
with Ada.Text_IO; -- use Ada.Text_IO;

Program entry point

Java:

public class SomeClass {
  public static void main(String[] args) {
    ...
  }
}

Ada:

procedure Some_Procedure is
begin
  ...
end Some_Procedure;

Print

Java:

System.out.println("hello");

Ada:

-- Assuming `with Ada.Text_IO; use Ada.Text_IO;`
Put_Line("hello");

Declarations

Java:

...
  int outerScopeField1;
  int outerScopeField2;
  void proc(int arg) {
    int localVar1 = arg * outerScopeField2;
    int localVar3 = arg * localVar1;
    int localVar2 = 2;
    this.outerScopeField2 = localVar1 * localVar2;
    this.outerScopeField1 = arg;
  }

Ada:

procedure Proc 
 (Arg : Integer;
  Outer_Scope_Field_1 : out Integer;
  Outer_Scope_Field_2 : in out Integer)
is
  Local_Var_1, Local_Var_3 : Integer;
  Local_Var_2 : Integer := 2;
begin
  Local_Var_1 := Arg * Outer_Scope_Field_2;
  Local_Var_3 := Arg * Local_Var_1;
  Outer_Scope_Field_2 := Local_Var_1 * Local_Var_2;
  Outer_Scope_Field_1 := Arg;
end Proc;

Condition

Java:

if (a > 0) {
  ...;
  ...;
} else if (a < 0) {
  ...;
  ...;
} else {
  ...;
  ...;
}

Ada:

if A > 0 then
  ...;
  ...;
elsif A < 0 then
  ...;
  ...;
else
  ...;
  ...;
end if;

Switching

Java:

switch (var) {
  case 0: case 1: case 2:
    ...;
    break;
  default:
    ...;
}

Ada:

case Var is
  when 0 .. 2 =>
    ...;
  when others =>
    ...;
end case;

Loops

Java:

while (var > 0) {
  ...;
}

for (int i = N; i >= 0; i--) {
  ...;
}

for (int i : intArray) {
  ...;
}

Ada:

while Var > 0 loop
  ...;
end loop;

for i in reverse 0 .. N loop
  ...;
end loop;

for i of Int_Array loop
  ...;
end loop;

Ada’s strong typing vs Java’s implicit conversions

Java:

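double a = 1 / 3;          // 0.0: integer division, then implicit widening to double
double b = 1 / (double)3;  // 0.333...: 1 is implicitly widened to double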

Ada:

...  -- declarations
  A : Float;
  B : Float;
begin
  A := Float (1 / 3);          -- 0.0: integer division, then explicit conversion
  B := Float (1) / Float (3);  -- 0.333...: every conversion is explicit
  ...

Class/Type

Compilation unit (Package / .java file)

C

variable, pointer types, heap, stack

// Let's create a variable: an allocated memory range on the stack,
// sized to hold the data of a certain type, here `int` (4 bytes):
int a; 
// at this point `a` has an allocated memory range but it is uninitialized: 
// its memory range contains the trash that was written here previously.

// let's create another variable of type `int*` this time. 
// It will be mapped to an allocated memory range on the stack, 
// that can hold a memory address (64 or 32 bits) 
// which is intended to point to an `int`'s data (4 bytes).
int *p;
// at this point, if we try to modify the 4 bytes of data pointed to
// by the memory address held by the variable `p` using
// `*p = ...`, the behavior is undefined and we will typically end up
// with a segfault (core dump).
// This is because `p` is uninitialized and the memory address it holds is trash,
// most of the time outside of our allowed memory space.

// variable `p` initialisation: let's put into the  memory range of the
// variable `p` the address of the memory range of `a`:
p = &a;

// now let's initialise the `int` data both held inside variable `a`
// and pointed to by the memory address held inside variable `p`,
// using two different ways:
a = 1;
// at this point a == 1 and *p == 1
*p = 2;
// at this point a == 2 and *p == 2


// let's make variable `trash` hold a memory range
// on the STACK that can contain the data of 2 `int`s
// (8 bytes, not initialized):
int trash[2];

// let's make variable `p` hold the memory address of
// the memory range held by variable `trash` on the STACK
// (the array decays to a pointer to its first element).
p = trash;
// with initialization:
int ones[2] = {1, 1};
p = ones;

// let's make variable `p` hold the memory address of
// a memory range allocated on the HEAP (`malloc` requires <stdlib.h>)
// that can contain the data of 2 `int`s (8 bytes, not initialized):
p = (int *)malloc(2 * sizeof(int));

// let's put the value held by the variable `a`
// inside the second `int` slot of the memory range pointed
// by the address held by variable `p`.
// In two different ways:
*(p + 1) = a;
p[1] = a;

// let's make variable `p` hold the memory address of
// a memory range allocated on the STACK
// (the memory range held by variable `grid`)
// that can contain the data of 6 `int`s (24 bytes, not initialized).
// A new name is needed because `trash` is already declared above,
// and `&grid[0][0]` gives an `int *` to the first element:
int grid[2][3];
p = &grid[0][0];

// let's put the value held by the variable `a`
// inside the last `int` slot of the memory range pointed to
// by the address held by variable `p`.
// In 3 different ways:
*(p + 5) = a;
p[5] = a;
grid[1][2] = a;

SQL

SELECT struct replacing only one field in it

SELECT
  ...
  , (SELECT AS STRUCT struct_name.* REPLACE(new_value AS struct_field)) AS struct_name
FROM ...

Get a sample of 10 values from a group

SELECT ARRAY_AGG(field LIMIT 10) FROM ...

Only distinct values:

SELECT ARRAY_AGG(DISTINCT field LIMIT 10) FROM ...

LEFT SEMI JOIN

The result table of A LEFT SEMI JOIN B is the subset of A’s records (with only A’s fields) whose key matches at least one record in B.
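A runnable sketch of these semantics, using Python's built-in sqlite3 module. SQLite has no LEFT SEMI JOIN keyword, so the equivalent WHERE EXISTS form is used here as a stand-in. The tables `a` and `b` and their columns are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, name TEXT);
    CREATE TABLE b (a_id INTEGER);
    INSERT INTO a VALUES (1, 'one'), (2, 'two'), (3, 'three');
    INSERT INTO b VALUES (1), (1), (3);
""")

# Semi-join expressed with EXISTS: only a's fields are returned, and each
# record of a appears at most once, however many records of b match it.
rows = conn.execute("""
    SELECT a.id, a.name
    FROM a
    WHERE EXISTS (SELECT 1 FROM b WHERE b.a_id = a.id)
    ORDER BY a.id
""").fetchall()

print(rows)  # [(1, 'one'), (3, 'three')]
```

Note that record 1 is returned once although it matches two records of b, which is the difference with an inner join.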

Recursive CTE

WITH RECURSIVE cte AS (
	SELECT /*...*/ -- anchor statement
	UNION ALL
	SELECT
		/*...*/
	FROM cte
		/*...*/
	WHERE /*...*/
)

Let’s call

  SELECT
		/*...*/
	FROM cte
		/*...*/
	WHERE /*...*/

the recursive statement.

What the engine does:

It first evaluates SELECT /*...*/ -- anchor statement and produces cte0.
Then the recursive statement is evaluated with cte0 as cte and produces cte1.
Then the recursive statement is evaluated with cte1 as cte and produces cte2 …

This continues until an evaluation of the recursive statement returns an empty result.

In the end, cte contains the union of the anchor statement result and of all the recursive statement evaluations.
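The evaluation described above can be tried with Python's built-in sqlite3 module, which supports WITH RECURSIVE. This is only a sketch on a toy counter; the column name `n` and the bound 5 are arbitrary choices for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
rows = conn.execute("""
    WITH RECURSIVE cte(n) AS (
        SELECT 1                 -- anchor statement: cte0 = {1}
        UNION ALL
        SELECT n + 1 FROM cte    -- recursive statement
        WHERE n < 5              -- returns nothing once n reaches 5
    )
    SELECT n FROM cte ORDER BY n
""").fetchall()

numbers = [n for (n,) in rows]
print(numbers)  # [1, 2, 3, 4, 5]
```

Each recursive step here produces a single row (2, then 3, and so on); the step evaluated on the row 5 is empty, which stops the recursion.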

cte_max_recursion_depth hint syntax

WITH RECURSIVE cte AS (
	/*...*/
)
SELECT /*+ SET_VAR(cte_max_recursion_depth = 1M) */
	*
FROM cte
