python - Referencing columns in a PySpark DataFrame
Suppose I have a word list converted to a data frame:
------
| word |
------
| cat  |
| bird |
| dog  |
| ...  |
------
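For context, a minimal sketch of how such a words_df could be built; the SparkSession setup and the exact word list here are illustrative assumptions, not part of the original question:

>>> from pyspark.sql import SparkSession
>>> spark = SparkSession.builder.getOrCreate()  # assumed local session
>>> words_df = spark.createDataFrame([("cat",), ("bird",), ("dog",)], ["word"])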
and I tried a letter count:

from pyspark.sql.functions import length

letter_count_df = words_df.select(length(words_df.word))

I know the resulting DataFrame has only a single column:

---------------
| length(word) |
---------------
| 3            |
| 4            |
| 3            |
| ...          |
---------------

How do I refer to the column of letter_count_df without using an alias?
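As a quick check, the auto-generated column name can be read back from the result itself (assuming the letter_count_df built above):

>>> letter_count_df.columns
['length(word)']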
With the name:

>>> letter_count_df.select(c)
DataFrame[length(word): int]

or with col and the name:

>>> from pyspark.sql.functions import *
>>> letter_count_df.select(col(c))

with c being a constant:

>>> c = "length(word)"

or:

>>> c = letter_count_df.columns[0]
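Putting the pieces together, a self-contained sketch; the sample words are an illustrative assumption:

from pyspark.sql import SparkSession
from pyspark.sql.functions import col, length

spark = SparkSession.builder.getOrCreate()
words_df = spark.createDataFrame([("cat",), ("bird",), ("dog",)], ["word"])
letter_count_df = words_df.select(length(words_df.word))

# Pick up the generated column name instead of hard-coding it
c = letter_count_df.columns[0]         # 'length(word)'

letter_count_df.select(c).show()       # refer to the column by name
letter_count_df.select(col(c)).show()  # or as a Column object

For comparison, the alias route the question wants to avoid would be words_df.select(length(words_df.word).alias("letter_count")), which gives the column a predictable name up front.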