python - Referencing columns in a PySpark DataFrame


Suppose I have a word list converted to a DataFrame:

+------+
| word |
+------+
| cat  |
| bird |
| dog  |
| ...  |
+------+

and I tried to get a letter count:

from pyspark.sql.functions import length

letter_count_df = words_df.select(length(words_df.word))

I know the resulting DataFrame has a single column only.

How do I refer to the column of letter_count_df without using an alias?

+--------------+
| length(word) |
+--------------+
|            3 |
|            4 |
|            3 |
|          ... |
+--------------+

Can I select it with its name:

>>> letter_count_df.select(c)
DataFrame[length(word): int]

or with col and the name:

>>> from pyspark.sql.functions import *
>>> letter_count_df.select(col(c))

with c being the constant:

>>> c = "length(word)" 

or

>>> c = letter_count_df.columns[0] 
