[BUG] DataFrame from_dict silently fails and generates invalid data with inputs in the form of orient="dict"
#13614
Thanks. This happens because [...] In contrast the [...]
import pandas as pd
data = {"a": [10, 4, 6], "b": [3, 5, 3]}
df = pd.DataFrame.from_dict(data)
new = pd.DataFrame(df.to_dict())
This is, I think, mostly a consequence of there not being a symmetry in the [...] To summarise the pandas behaviour:
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [3, 4, 5]})
orient = {'dict', 'list', 'series', 'split', 'tight', 'records', 'index'}
for o in sorted(orient):
    try:
        # Round-trip the frame through to_dict/from_dict for this orientation.
        new = pd.DataFrame.from_dict(df.to_dict(orient=o))
        try:
            same = (new == df).all().all()
            if same:
                print(f"# Success for {o}")
            else:
                raise ValueError("round-tripped data differs")
        except Exception:
            print(f"# Read, but data wrong for {o}")
    except Exception:
        print(f"# Unable to read for {o}")
# Success for dict
# Read, but data wrong for index
# Success for list
# Success for records
# Success for series
# Unable to read for split
# Unable to read for tight
So if the [...] In contrast, cudf:
import cudf as pd
df = pd.DataFrame({"a": [1, 2, 3], "b": [3, 4, 5]})
orient = {'dict', 'list', 'series', 'split', 'tight', 'records', 'index'}
for o in sorted(orient):
    try:
        # Round-trip the frame through to_dict/from_dict for this orientation.
        new = pd.DataFrame.from_dict(df.to_dict(orient=o))
        try:
            same = (new == df).all().all()
            if same:
                print(f"# Success for {o}")
            else:
                raise ValueError("round-tripped data differs")
        except Exception:
            print(f"# Read, but data wrong for {o}")
    except Exception:
        print(f"# Unable to read for {o}")
# Read, but data wrong for dict
# Read, but data wrong for index
# Success for list
# Success for records
# Success for series
# Unable to read for split
# Unable to read for tight
So the [...]
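For reference, the "dict" and "list" orientations differ only in how each column is stored: "dict" maps every column to an {index: value} mapping, while "list" maps it to a plain list of values. The following is a minimal sketch in plain Python of a hypothetical helper (dict_orient_to_list_orient is an illustrative name, not part of pandas or cudf) that converts the first shape into the second, which is reported as a success in both summaries above. It assumes the index labels can be discarded.

```python
from typing import Any, Dict, Hashable, List


def dict_orient_to_list_orient(
    data: Dict[Hashable, Dict[Hashable, Any]],
) -> Dict[Hashable, List[Any]]:
    """Convert {column: {index: value}} into {column: [value, ...]}.

    Hypothetical helper, not part of pandas or cudf. Index labels are
    dropped; values keep the insertion order of each inner dict.
    """
    return {column: list(rows.values()) for column, rows in data.items()}


# Shape produced by DataFrame.to_dict() with the default orient="dict".
dict_oriented = {"a": {0: 1, 1: 2, 2: 3}, "b": {0: 3, 1: 4, 2: 5}}

list_oriented = dict_orient_to_list_orient(dict_oriented)
print(list_oriented)  # {'a': [1, 2, 3], 'b': [3, 4, 5]}
```

Because the index labels are thrown away, this sketch only suits frames with a default integer index whose columns all list rows in the same order; it is a possible workaround, not a fix for from_dict itself.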
This no longer reproduces for me on the latest cudf:
cudf.DataFrame.from_dict was added in #12048 to close #11934. In at least one scenario, from_dict fails silently on data generated by to_dict and generates columns of range(0, N). We should either succeed or prohibit this input orientation.
But it will fail silently and generate columns of range(0, N) if the data is in the to_dict default "dict" orientation:
In contrast, pandas succeeds: