How to filter documents in a tm corpus in R based on metadata?


I am using the R tm package and trying to select documents by index and by metadata:

orbit_corpus <- Corpus(tm_corpus, readerControl = list(reader = myreader))
meta(my_corpus[[1]])
  author  : a8
  origin  : department
  heading : whib
  id      : 1
  year    : 2013

I would like to find which of the first hundred documents in the corpus were published in 2013. The following works for checking whether the 'year' metadata of document 1 is 2013:

meta(my_corpus[[1]], "year") == 2013
[1] TRUE

I need something that lets me find, among the first 100 indexes, the documents that meet this criterion. I imagine something similar to the following (which does not work and, unfortunately, does not produce a list of documents):

meta(orbit_corpus[[1:100]], "year") == 2013
Error in x$content[[i]] : recursive indexing failed at level 4

Many thanks for your help!

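You can use tm_filter on the first 100 documents of the corpus (orbit_corpus[1:100]). Note the single brackets: [1:100] subsets the corpus, whereas [[ ]] extracts a single document, which is why [[1:100]] fails.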

tm_filter(orbit_corpus[1:100], FUN = function(x) meta(x)[["year"]] == "2013")

From the documentation:

tm_filter returns a corpus containing documents where FUN matches
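
To try this without the original data, here is a minimal sketch on a toy corpus. The corpus contents, the object names (toy_corpus, years, is_2013) and the numeric 'year' tag are assumptions made up for illustration; they are not taken from the question's orbit_corpus. It also uses tm_index, which takes the same arguments as tm_filter but reports which documents match instead of returning them.

```r
library(tm)

# Toy corpus of three short documents (made-up content).
docs <- c("first text", "second text", "third text")
toy_corpus <- VCorpus(VectorSource(docs))

# Attach a hypothetical 'year' tag to each document's own metadata.
years <- c(2013, 2012, 2013)
for (i in seq_along(toy_corpus)) {
  meta(toy_corpus[[i]], "year") <- years[i]
}

# Predicate applied to each document in turn.
is_2013 <- function(x) meta(x, "year") == 2013

# Single brackets subset the corpus; double brackets extract one document.
subset_2013 <- tm_filter(toy_corpus[1:3], FUN = is_2013)

# tm_index reports which documents of the subset satisfy the predicate.
idx_2013 <- tm_index(toy_corpus[1:3], FUN = is_2013)
```

With real data, replace toy_corpus[1:3] with orbit_corpus[1:100] and adjust the predicate if the year is stored as a character string rather than a number.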

