pyspark.RDD.reduceByKeyLocally
RDD.reduceByKeyLocally(func: Callable[[V, V], V]) → Dict[K, V]
Merge the values for each key using an associative and commutative reduce function, but return the results immediately to the master as a dictionary.

This will also perform the merging locally on each mapper before sending results to a reducer, similarly to a “combiner” in MapReduce.

New in version 0.7.0.

Parameters
func : function
    the reduce function
Returns
dict
    a dict containing the keys and the aggregated result for each key
See Also

RDD.reduceByKey

Examples

>>> from operator import add
>>> rdd = sc.parallelize([("a", 1), ("b", 1), ("a", 1)])
>>> sorted(rdd.reduceByKeyLocally(add).items())
[('a', 2), ('b', 1)]
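To make the contrast with RDD.reduceByKey concrete, here is a minimal, self-contained sketch. The SparkContext setup is for illustration only (the doctest above assumes an existing sc), and the app name is arbitrary:

>>> from operator import add
>>> from pyspark import SparkContext
>>> sc = SparkContext("local[2]", "reduceByKeyLocally-demo")  # illustrative local context
>>> rdd = sc.parallelize([("a", 1), ("b", 1), ("a", 1), ("b", 2)])
>>> # reduceByKeyLocally returns a plain Python dict on the driver:
>>> sorted(rdd.reduceByKeyLocally(add).items())
[('a', 2), ('b', 3)]
>>> # reduceByKey returns a distributed RDD instead, so the results
>>> # must be brought back to the driver explicitly:
>>> sorted(rdd.reduceByKey(add).collectAsMap().items())
[('a', 2), ('b', 3)]
>>> sc.stop()

Because the entire result dictionary is materialized on the driver, this method is only suitable when the set of distinct keys is small enough to fit in driver memory; for large key spaces, prefer RDD.reduceByKey.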