Releases: Kotlin/dataframe

v0.15: Experimental new CSV and Geographic integrations and many other fixes

09 Dec 14:13

This release contains several new features, many fixes, and two exciting new experimental integrations:

  • Experimental new CSV parser based on Deephaven-CSV. See below for more information.
  • Experimental new GeoDataFrame class for working with geographical data (from GeoJson/Shapefile) and plotting it with Kandy. See below for more information.
  • Full BigInteger support:
    Just as we support BigDecimal, DataFrame now also supports BigInteger in parsing, converting, statistics, column arithmetic, etc.
  • Custom SQL Database registration (read user guide)
  • Improved parsing:
    Parsing and converting String columns to other types is now faster.
    We added String -> Char parsing.
    We also introduce the new experimental ParserOptions.useFastDoubleParser setting, which uses FastDoubleParser for faster and more flexible Double parsing.
  • We continue improving our Compiler Plugin with every release. See below for more information.
  • See this notebook for some more information about the changes.

New Experimental CSV integration

DataFrame's CSV parsing has been based on Apache Commons CSV from the beginning. While this has been sufficient for most applications, it had some issues: it could run out of memory, its performance was limited, and our API lacked clarity, documentation, and completeness.

For DataFrame 0.15, we introduce a new, separate package, org.jetbrains.kotlinx:dataframe-csv, which tries to solve all these issues at once. It's based on Deephaven-CSV, which makes it faster and more memory efficient. And since we built it from the ground up, we made sure the API was complete, predictable, and carefully documented.

To try it yourself, explicitly add the dependency org.jetbrains.kotlinx:dataframe-csv to your project. In notebooks, you can add enableExperimentalCsv=true to the %use magic, like %use dataframe(enableExperimentalCsv=true).
Use the new DataFrame.readCsv()/DataFrame.readTsv()/DataFrame.readDelim() functions instead of the old DataFrame.readCSV() ones.
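
As a rough sketch of the project setup described above, a Gradle Kotlin DSL fragment might look like the following. The 0.15.0 version string is inferred from the changelog tag, and the line for the core dataframe artifact is an assumption; check Maven Central for the exact coordinates your project needs.

```kotlin
// build.gradle.kts (fragment, not a complete build script)
dependencies {
    // Core DataFrame library; version assumed from the v0.15.0 tag.
    implementation("org.jetbrains.kotlinx:dataframe:0.15.0")
    // Experimental CSV module; it has to be added explicitly.
    implementation("org.jetbrains.kotlinx:dataframe-csv:0.15.0")
}
```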

We happily await your feedback!

New Experimental Geo integration

Kandy v0.8 introduces geo-plotting which allows you to visualize geospatial/geographical data using the awesome Kandy DSL. To make working with this geographical data (from GeoJson/Shapefile) easier, we happily accepted the GeoDataFrame PR from the Kandy team.

To try it yourself, explicitly add the dependency org.jetbrains.kotlinx:dataframe-geo to your project or add enableExperimentalGeo=true to your notebook (with the repository maven("https://repo.osgeo.org/repository/release")). Then use GeoDataFrame.readGeoJson() or GeoDataFrame.readShapeFile() to get started!
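
For a Gradle project, the setup above could be sketched roughly as follows. This assumes the OSGeo repository is also required in a Gradle build and that the module is published under version 0.15.0 (inferred from the changelog tag); treat both as assumptions to verify.

```kotlin
// build.gradle.kts (fragment, not a complete build script)
repositories {
    mavenCentral()
    // Repository named in the release notes; assumed here to be needed
    // for resolving the geo module's transitive dependencies.
    maven("https://repo.osgeo.org/repository/release")
}

dependencies {
    // Experimental geo module; version assumed from the v0.15.0 tag.
    implementation("org.jetbrains.kotlinx:dataframe-geo:0.15.0")
}
```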

Features

Compiler Plugin

  • [Compiler plugin] Lower frontend generated implicit receivers by @koperagen in #869
  • Generate valid code in transform(call) when interpret(call) fails by @koperagen in #907
  • [Compiler plugin] Support dataFrameOf(Pair<String, List) by @koperagen in #908
  • [Compiler plugin] Add a mechanism to handle function calls to stdlib that can appear as df api arguments by @koperagen in #914
  • [Compiler plugin] Generate ColumnName annotations on frontend for all names that contain illegal characters by @koperagen in #913
  • Revert insertGenericTreeImpl by @koperagen in #923
  • [Compiler plugin] Propagate nullability in toDataFrame tree conversion by @koperagen in #942
  • Add castTo(Function) overload for workflows that use compiler plugin by @koperagen in #948
  • [Compiler plugin] Setup call transformer pipeline to handle (...) -> DataRow functions by @koperagen in #918
  • Compiler plugin read improvements by @koperagen in #949
  • [Compiler plugin] Support valueCounts by @koperagen in #951

Fixes

Docs and Examples

New Contributors

Full Changelog: v0.14.2...v0.15.0

v0.14: Kotlin 2.0 and many stability improvements

23 Sep 10:25

This release can mostly be described as a quality-of-life release. While there are not many new groundbreaking features at the moment, almost every part of the library has had some improvement. See the full list of changes below, but to highlight a few:

  • We now officially support Kotlin 2.0+. The library is built with 2.0.20 now, so it will work with KSP 2.0.20 too.
  • We've continued our work on the DataFrame Kotlin Compiler Plugin. While it is still experimental, it introduces an exciting new approach to working with your data in a zero-boilerplate, type-safe way, leveraging the power the Kotlin 2.0 compiler gives us. See this demo project to experiment with it yourself.
  • See this notebook for some of the small yet exciting features of the 0.14 release!

0.14.2

Includes the fix: #934 which removes the slf4j-simple dependency, keeping just slf4j-api.

0.14.1

Includes the fix: #872 which fixes compatibility with Kandy v0.7.1.

Features

  • Compiler plugin by @koperagen in #729
  • added toDataFrame for float- and double iterables by @Jolanrensen in #631
  • Allow any ArrowReader implementation to be used for reading Arrow data #627 by @fb64 in #628
  • add random parameter to shuffle by @koperagen in #643
  • apply ksp to multiplatform configs in multiplatform modules by @mgroth0 in #647
  • Add separator parameter to DataFrame.flatten by @zaleslaw in #667
  • POJO toDataFrame support (and array improvements) by @Jolanrensen in #650
  • Add JDBC credentials extraction from env variables and improve exception handling by @zaleslaw in #692
  • Added MS SQL support for the dataframe-jdbc module by @zaleslaw in #689
  • Update SQL all table/schemas reading functions to return maps with table names by @zaleslaw in #718
  • Add support for H2 modes by @zaleslaw in #720
  • Add delimiter parameter to readDelimStr by @koperagen in #743
  • Add an option to read Excel cell values as a String regardless of their content type by @koperagen in #745
  • Add castTo to help working with implicitly generated schemas in notebooks and plugin by @koperagen in #747
  • Replace Klaxon with kotlinx-serialization by @devcrocod in #603
  • Add df.convertTo(schemaFrom) overload by @koperagen in #764
  • Add Convert.asFrame function by @koperagen in #781
  • Add extension functions for the ResultSet by @zaleslaw in #772

Work on the compiler plugin

  • then operation in pivot column selection DSL inside aggregate by @koperagen in #617
  • Compiler plugin fixes by @koperagen in #740
  • Compiler plugin update by @koperagen in #755
  • Adding utils to help ensure that compile time schema ~ runtime schema by @koperagen in #767
  • Improve codegen for stdlib <-> df interop workflow by @koperagen in #763
  • Add initial support for CS DSL in the compiler plugin by @koperagen in #783
  • Refactor toDataFrame implementation in compiler plugin by @koperagen in #782
  • [Compiler plugin] Avoid throwing debugging exceptions in user projects because of false positives by @koperagen in #788
  • [Compiler plugin] silently abort interpretation in case of invariant errors by @koperagen in #812
  • [Compiler plugin] Support ColumnName annotation in extension properties codegen by @koperagen in #818
  • Update compiler plugin by @koperagen in #832

Fixes

Docs and Examples

0.13.1 Columns Selection DSL, KDocs, Table Rendering, and Many Fixes!

19 Mar 18:31

We just released v0.13.1!

Documentation/Readme might not be up-to-date yet. However, feel free to test it in your project and let us know if something does not work as expected!

To sum up the most significant changes:

  • We finally merged the new ColumnsSelection DSL (documentation will come too). This comes with KDocs everywhere, previously missing overloads, and clearer function names.
  • We're continuously improving support for Kotlin Notebook by creating and improving a native table component for DataFrames, by @ermolenkodev
  • (Nested) table output now looks better for notebooks on GitHub (for example)
  • Improvements to Arrow reading (thanks, @fb64 and @Kopilov!)
  • Many other fixes and version bumps, see below!

Check it out on Maven Central

Features

Fixes

Docs and Examples

Version Updates

New Contributors

Known Issues

Full Changelog: v0.12.1...v0.13.1

0.12.1 Bug Fix: Improved SQL type to Kotlin types mapping

18 Jan 13:04

Read the JDBC support documentation

What's Changed

Full Changelog: 0.12...v0.12.1

0.12: SQL Databases as a Data Source via JDBC

18 Oct 14:26

Read the JDBC support documentation

What's Changed

Full Changelog: build-0.11.1...0.12

0.11.1

07 Aug 12:39

Hotfix for Kotlin notebook code generation 17c52f5

0.11.0 Onboarding documentation update and minor API changes

26 Jun 12:18

What's Changed

Full Changelog: build-0.10.1...build-0.11.0

0.10.1 Bug fix: Android compatibility and KSP updates

23 May 11:59

What's Changed

New Contributors

Full Changelog: build-0.10.0...build-0.10.1

Dataframe 0.10.0

11 Apr 17:48

New version targeting Kotlin 1.8.20 and KSP 1.8.20-1.0.10

KDocs were introduced in many places, so check them out in the IDE!

In addition, the documentation now shows the result of most operations as interactive tables. This should make it much clearer what is going on, even for relatively complex operations such as pivot.

Known issues

There's an issue with incremental compilation in KSP 1.8.20-1.0.10 that sometimes leads to build errors when using our Gradle plugin. If you are experiencing this problem, try disabling incremental compilation or stick to an older version, for example 0.10.0-dev-1532.

New API

Check out the updated dataframe rendering API if you want to customize your outputs in the notebook or want to save or display dataframes in HTML format.

Auto Generated What's Changed

New Contributors

Full Changelog: build-0.9.1...build-0.10.0

Dataframe 0.9.1

26 Jan 12:12
66af0ab

Kotlin Dataframe 0.9.1 released!

Blog post: https://blog.jetbrains.com/kotlin/2023/01/kotlin-dataframe-0-9-1-released/

Get it on Maven Central!

TL;DR

Auto Generated What's Changed

New Contributors

Full Changelog: build-0.8.0...build-0.9.1