GitHub - FRosner/drunken-data-quality at 1.2.0

Drunken Data Quality (DDQ)

Description

DDQ is a small library for checking constraints on Spark data structures. It can be used to ensure data quality, especially when data is imported continuously.

Getting DDQ

To use DDQ, add it as a dependency to your project via JitPack.io. In your build.sbt, add the JitPack resolver and the dependency, replacing x.y.z with the release version you want:

resolvers += "jitpack" at "https://jitpack.io"

libraryDependencies += "com.github.FRosner" % "drunken-data-quality" % "x.y.z"

If you are not using one of the dependency management systems supported by JitPack, you can download a compiled artifact from the release section or build from source.

Using DDQ

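import java.text.SimpleDateFormat

import de.frosner.ddq._

// Load the tables to check; assumes a SQLContext named sqlContext is in scope (e.g. in spark-shell)
val customers = sqlContext.table("customers")
val contracts = sqlContext.table("contracts")

// Chain the constraints on the customers table, then evaluate them with run()
Check(customers)
  .hasNumRowsEqualTo(100000)
  .isNeverNull("customer_id")
  .hasUniqueKey("customer_id")
  .satisfies("customer_age > 0")
  .isConvertibleToDate("customer_birthday", new SimpleDateFormat("yyyy-MM-dd"))
  .hasForeignKey(contracts, "customer_id" -> "contract_owner_id")
  .run()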
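
To make the meaning of some of these constraints concrete, here is a minimal sketch of three of them (never-null, unique key, foreign key) on plain Scala collections. It is not DDQ's implementation and does not use Spark: the `ConstraintSketch` object, the `Row` type, and all helper signatures are hypothetical and exist only for illustration. The foreign-key direction (values of the checked table must appear in the reference table) is an assumption about what the check above is meant to express.

```scala
// Illustrative only: plain Scala collections stand in for Spark tables,
// and none of these helper names are part of DDQ.
object ConstraintSketch {
  // A row is modelled as a map from column name to an optional value (None = NULL).
  type Row = Map[String, Option[Any]]

  // True if the column holds a value in every row.
  def isNeverNull(rows: Seq[Row], column: String): Boolean =
    rows.forall(_.getOrElse(column, None).isDefined)

  // True if no two rows share the same entry in the column.
  // In this sketch, two NULL entries also count as a duplicate.
  def hasUniqueKey(rows: Seq[Row], column: String): Boolean = {
    val keys = rows.map(_.getOrElse(column, None))
    keys.distinct.size == keys.size
  }

  // True if every non-null value of `column` in `rows`
  // also appears in `refColumn` of `reference`.
  def hasForeignKey(rows: Seq[Row], column: String,
                    reference: Seq[Row], refColumn: String): Boolean = {
    val referenced = reference.flatMap(_.getOrElse(refColumn, None)).toSet
    rows.flatMap(_.getOrElse(column, None)).forall(referenced.contains)
  }
}
```

On real data, a check like this has to scan the whole table, which is why running it as a Spark job is useful once tables grow beyond what fits in memory.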

Authors

Frank Rosner (Creator)
Slavo N. (Contributor)

License

This project is licensed under the Apache License Version 2.0. For details please see the file called LICENSE.
