Paweł Szulc

ready to expect the unexpected

Paweł Szulc is a seasoned developer with a rich background in functional programming languages, including Scala and Haskell. Known in the developer community as "EncodePanda," Paweł is passionate about exploring complex software design patterns and bridging the gap between practicality and expressiveness in code.

Recently, Paweł has expanded his expertise to include Rust, intrigued by its powerful system-level programming capabilities and the language’s emphasis on memory safety without sacrificing performance. His transition to Rust aligns well with his interest in leveraging expressive type systems to write robust, efficient software.

"Apache Spark™ is a fast and general engine for large-scale data processing." The above statement is taken from the Apache Spark welcome page. It's one of those definitions that, while describing the product in one sentence and being 100% true, still tell little to the wondering noob.

Why take an interest in Apache Spark? Apache Spark promises to be up to 100x faster than Hadoop MapReduce in certain scenarios. It provides a comprehensible programming model (familiar to everyone who is used to functional programming) and a vast ecosystem of tools.

In my talk I will try to reveal the secrets of Apache Spark to complete beginners.

We will start with a quick introduction to the set of problems commonly known as Big Data: what they try to solve, what their obstacles and challenges are, and how those can be addressed. We will take a quick peek at MapReduce: theory and implementation. We will then move on to Apache Spark. We will see what the main factor was that drove its creators to introduce yet another large-scale processing engine. We will see how it works and what its main advantages are. The presentation will be a mix of slides and code examples.

Apache Spark 101
