sparkwarc: Load WARC Files into Apache Spark

Load WARC (Web ARChive) files into Apache Spark using 'sparklyr'. This makes it possible to read files from the Common Crawl project <http://commoncrawl.org/>.
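
As a rough illustration of the workflow described above, the following minimal sketch loads one WARC file into a local Spark session. It assumes a local Spark installation is available to 'sparklyr' and that the package's spark_read_warc() function accepts a connection, a table name, and a path. The file path and the table name "warc" are hypothetical placeholders, not values taken from the package.

```r
library(sparklyr)
library(sparkwarc)

# Connect to a local Spark instance (assumes Spark is already installed,
# for example through sparklyr::spark_install()).
sc <- spark_connect(master = "local")

# Hypothetical path to a local WARC file; replace with a real file.
warc_path <- "path/to/sample.warc.gz"

# Register the contents of the WARC file as a Spark table named "warc".
spark_read_warc(sc, name = "warc", path = warc_path)

# Query the registered table through DBI, one of the package's imports.
DBI::dbGetQuery(sc, "SELECT count(*) AS n FROM warc")

spark_disconnect(sc)
```

The reference manual is the authoritative source for the remaining arguments of spark_read_warc(), such as those for filtering or repartitioning.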

Version: 0.1.5
Imports: DBI, sparklyr, Rcpp
LinkingTo: Rcpp
Published: 2020-12-15
Author: Yitao Li [aut, cre], Javier Luraschi [aut]
Maintainer: Yitao Li <yitao at rstudio.com>
BugReports: https://github.com/r-spark/sparkwarc
License: Apache License 2.0
NeedsCompilation: yes
SystemRequirements: C++11
Materials: README
CRAN checks: sparkwarc results

Documentation:

Reference manual: sparkwarc.pdf

Downloads:

Package source: sparkwarc_0.1.5.tar.gz
Windows binaries: r-devel: sparkwarc_0.1.5.zip, r-devel-UCRT: sparkwarc_0.1.5.zip, r-release: sparkwarc_0.1.5.zip, r-oldrel: sparkwarc_0.1.5.zip
macOS binaries: r-release (arm64): sparkwarc_0.1.5.tgz, r-release (x86_64): sparkwarc_0.1.5.tgz, r-oldrel: sparkwarc_0.1.5.tgz
Old sources: sparkwarc archive

Linking:

Please use the canonical form https://CRAN.R-project.org/package=sparkwarc to link to this page.