Extracts retweet cascades from a JSONL file in which each line is a tweet JSON object. For the structure of a tweet object, refer to the Twitter developer documentation: https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/tweet-object

parse_raw_tweets_to_cascades(
  path,
  batch = 1e+05,
  cores = 1,
  output_path = NULL,
  keep_user = F,
  keep_absolute_time = F,
  progress = T,
  return_as_list = T,
  save_temp = F
)

Arguments

path	File path to the tweets jsonl file
batch	Number of tweets read and processed at each iteration; choose a value that fits your available memory. Defaults to 100000 tweets per iteration.
cores	Number of cores to be used for processing each batch in parallel.
output_path	If provided, the index.csv and data.csv files that define the cascades will be generated. In index.csv, each row is a cascade whose events can be obtained from data.csv using the corresponding indices (start_ind to end_ind). Defaults to NULL.
keep_user	If TRUE, Twitter user ids are kept. Defaults to FALSE.
keep_absolute_time	If TRUE, the absolute tweeting times are kept. Defaults to FALSE.
progress	If TRUE (default), progress is reported.
return_as_list	If TRUE (default), a list of cascades (data.frames) is returned.
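
The relationship between index.csv and data.csv described under output_path can be sketched as follows. This is a minimal illustration using in-memory data frames rather than real output files. Only start_ind and end_ind come from the description above; the time and magnitude columns, and the assumption that the indices are 1-based and inclusive, are made up for illustration.

```r
# Toy stand-in for data.csv: events from all cascades, stacked row-wise.
# The column names `time` and `magnitude` are assumptions for illustration.
data <- data.frame(
  time      = c(0, 12, 40, 0, 5),
  magnitude = c(100, 20, 35, 250, 10)
)

# Toy stand-in for index.csv: one row per cascade, pointing at rows of `data`.
# Indices are assumed to be 1-based and inclusive.
index <- data.frame(
  start_ind = c(1, 4),
  end_ind   = c(3, 5)
)

# Rebuild the list of cascades by slicing `data` with each index row.
cascades <- lapply(seq_len(nrow(index)), function(i) {
  data[index$start_ind[i]:index$end_ind[i], , drop = FALSE]
})

print(length(cascades))     # number of cascades
print(nrow(cascades[[1]]))  # number of events in the first cascade
```

Keeping all events in one file and addressing them by row ranges means a single cascade can be loaded without parsing the others.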

Value

If return_as_list is TRUE, a list of data.frames in which each data.frame is a retweet cascade. Otherwise nothing is returned.
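
A call might look like the sketch below. It assumes the package that provides parse_raw_tweets_to_cascades is already attached; the file name tweets.jsonl is hypothetical. The call is guarded so the snippet runs harmlessly when either the function or the file is unavailable.

```r
# Hypothetical input path; replace with a real JSONL file of tweet objects.
tweets_path <- "tweets.jsonl"

if (exists("parse_raw_tweets_to_cascades") && file.exists(tweets_path)) {
  cascades <- parse_raw_tweets_to_cascades(
    path = tweets_path,
    batch = 10000,          # smaller batches lower peak memory use
    cores = 1,
    keep_user = TRUE,       # keep Twitter user ids in each cascade
    return_as_list = TRUE   # return a list of data.frames
  )
  print(length(cascades))   # number of cascades extracted
} else {
  cascades <- NULL
  message("Function or input file not available; skipping the example call.")
}
```

To write index.csv and data.csv instead of keeping everything in memory, pass an output_path and set return_as_list = FALSE.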