OpenCPU

Andreas Blätte (andreas.blaette@uni-due.de)

2019-12-17

Objective

Sometimes it is not possible, for practical or legal reasons, to move corpus data to a local machine. This vignette explains how to use CWB corpora that are hosted on an OpenCPU server.

library(polmineR)

Remote Corpus

The GermaParl corpus is hosted on an OpenCPU server with the IP 52.24.44.232 (subject to change). To use the corpus, call the corpus()-method as you would for a local corpus. The only difference is that you need to supply the IP address via the argument server.

gparl <- corpus("GERMAPARL", server = "52.24.44.232")

The gparl object has the class remote_corpus.

is(gparl)

Using polmineR core functionality

At this stage, polmineR exposes only a limited set of its functionality for remote corpora. Simple investigations of the remote corpus are nevertheless possible.

Get corpus size
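A minimal sketch of this step, assuming that the size()-method known from local corpora is also exposed for remote_corpus objects and that the server named above is reachable:

```r
library(polmineR)

# Connect to the remote corpus (IP as given above, subject to change)
gparl <- corpus("GERMAPARL", server = "52.24.44.232")

# Ask the server for the number of tokens in the corpus;
# assumption: size() is among the methods exposed for remote corpora
size(gparl)
```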

Get structural annotation (metadata)
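A minimal sketch of this step, assuming that the s_attributes()-method is exposed for remote_corpus objects and that GermaParl carries a structural attribute named year (both are assumptions, not confirmed by this vignette):

```r
library(polmineR)

# Connect to the remote corpus (IP as given above, subject to change)
gparl <- corpus("GERMAPARL", server = "52.24.44.232")

# List the structural attributes (metadata) available on the server
s_attributes(gparl)

# List the values of one structural attribute;
# "year" is an assumed attribute name used for illustration
s_attributes(gparl, "year")
```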

Subsetting
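A minimal sketch of subsetting, assuming that the subset()-method is exposed for remote_corpus objects and that a structural attribute year exists. The object name gparl2006 and the value "2006" are chosen for illustration only:

```r
library(polmineR)

# Connect to the remote corpus (IP as given above, subject to change)
gparl <- corpus("GERMAPARL", server = "52.24.44.232")

# Restrict the corpus to one year; the expression is evaluated on the server.
# Assumption: "year" is a structural attribute of GERMAPARL
gparl2006 <- subset(gparl, year == "2006")

# Inspect the class of the result
is(gparl2006)
```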

The returned object has the class remote_subcorpus.

Simple count

The count()-method works for remote_subcorpus objects, too.
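Counting on the full remote corpus and on a remote subcorpus can be sketched as follows. The query term "Integration", the attribute year and the object name gparl2006 are illustrative assumptions:

```r
library(polmineR)

# Connect to the remote corpus (IP as given above, subject to change)
gparl <- corpus("GERMAPARL", server = "52.24.44.232")

# Count occurrences of a token in the whole remote corpus
count(gparl, query = "Integration")

# Count occurrences of the same token in a remote subcorpus;
# assumption: "year" is a structural attribute of GERMAPARL
gparl2006 <- subset(gparl, year == "2006")
count(gparl2006, query = "Integration")
```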

Next steps

This is a simple proof of concept. Upcoming versions of polmineR will expose further functionality for remote corpora.