1. Introduction to problematic
1.1. Spark overview
   1.1.1. Spark Core
   1.1.2. Spark SQL
1.2. Definition of the problematic
   1.2.1. Determinism
   1.2.2. Problematic
2. Tour of the approaches
2.1. Pure SQL
2.2. Catalyst rules tuning
2.3. Common Sub-expression Elimination (CSE)
   2.3.1. Digression: Javac, JIT and CSE
   2.3.2. Catalyst's Codegen
2.4. Memoization
   2.4.1. Equality
   2.4.2. Space management and eviction policies
3. Presented solution: flexible-memoization
3.1. Goals definition
3.2. Usage overview
   3.2.1. Build
   3.2.2. Hello example
   3.2.3. Recursive examples
3.3. Structure
   3.3.1. UML
   3.3.2. Core abstractions
   3.3.3. Provided implementation
3.4. Answering the problematic
4. Conclusion
5. References