pyspark.pandas.range
Create a DataFrame with some range of numbers.
The resulting DataFrame has a single int64 column named id, containing elements in a range from start to end (exclusive) with step value step. If only the first parameter (i.e. start) is specified, we treat it as the end value with the start value being 0.
This is like the range function in SparkSession and is used primarily for testing.
Parameters
start: the start value (inclusive)
end: the end value (exclusive)
step: the incremental step
num_partitions: the number of partitions of the DataFrame
Examples
When only the first parameter is specified, we generate a range of values from 0 up to, but not including, that number.
>>> import pyspark.pandas as ps
>>> ps.range(5)
   id
0   0
1   1
2   2
3   3
4   4
When start, end, and step are specified:
>>> ps.range(start=100, end=200, step=20)
    id
0  100
1  120
2  140
3  160
4  180
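For the cases shown above, the start, end, and step rules line up with Python's built-in range, including the single-argument form where the value is taken as the end. The following is a minimal pure-Python sketch of that argument handling. The helper name expected_ids is hypothetical and not part of pyspark; it only models the values of the id column and does not create a DataFrame or touch Spark.

```python
def expected_ids(start, end=None, step=1):
    """Hypothetical helper (not part of pyspark).

    Returns the id values that the description above implies,
    using Python's built-in range as a stand-in.
    """
    if end is None:
        # Only one argument given: treat it as the end value, start from 0.
        start, end = 0, start
    # end is exclusive, matching the documented behaviour.
    return list(range(start, end, step))


print(expected_ids(5))             # [0, 1, 2, 3, 4]
print(expected_ids(100, 200, 20))  # [100, 120, 140, 160, 180]
```

A sketch like this can be handy in tests, for example to build the expected id values before comparing them with the id column of the DataFrame returned by ps.range.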