After fixing the failing tests mentioned in #337, other tests still fail:
FAILED tests/system/small/test_dataframe.py::test_dataframe_bool_aggregates[all_axis0] - AssertionError: Series.index are different
FAILED tests/system/small/test_dataframe.py::test_dataframe_bool_aggregates[any_axis0] - AssertionError: Series.index are different
FAILED tests/system/small/test_dataframe.py::test_df_pivot[values2-int64_too-columns2] - AssertionError: DataFrame.iloc[:, 0] (column name="('int64_col', <NA>)") are different
FAILED tests/system/small/test_groupby.py::test_dataframe_groupby_analytic[cumprod] - AssertionError: DataFrame.iloc[:, 1] (column name="int64_col") are different
FAILED tests/system/small/test_dataframe.py::test_to_pandas_downsampling_option_override - assert 1.3427486419677734 == 1 ± 3.0e-01
FAILED tests/system/small/test_series.py::test_series_add_prefix - AssertionError: Series.index are different
FAILED tests/system/small/test_series.py::test_series_add_suffix - AssertionError: Series.index are different
FAILED tests/system/small/test_series.py::test_groupby_window_ops[cumprod] - AssertionError: Series are different
FAILED tests/system/small/test_series.py::test_string_astype_int - AssertionError: Series.index are different
@tswast mentioned that the failures caused by the RangeIndex vs. Int64Index distinction could be unblocked by setting check_index_type=False: https://pandas.pydata.org/docs/reference/api/pandas.testing.assert_series_equal.html#pandas.testing.assert_series_equal
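For reference, a minimal sketch of what relaxing the index check looks like. The two Series here are made up for illustration, and the int64 vs. uint64 index mismatch is an assumption standing in for whatever index type difference the failing tests actually hit:

```python
import pandas as pd

# Hypothetical data: identical values and index labels, but the index dtypes differ.
left = pd.Series([10, 20, 30], index=pd.Index([0, 1, 2], dtype="int64"))
right = pd.Series([10, 20, 30], index=pd.Index([0, 1, 2], dtype="uint64"))

# With the default check_index_type="equiv", the index dtype mismatch is reported.
try:
    pd.testing.assert_series_equal(left, right)
    default_fails = False
except AssertionError as exc:
    default_fails = True
    print(f"default check failed: {str(exc).splitlines()[0]}")

# With check_index_type=False, only the index values are compared,
# so the same pair is accepted.
pd.testing.assert_series_equal(left, right, check_index_type=False)
```

This only hides index class and dtype differences; differing index values would still fail.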
But the iloc errors may be real issues. The call stack is shown below:
=================================== FAILURES ===================================
__________________ test_df_pivot[values2-int64_too-columns2] ___________________
scalars_dfs = ( bool_col bytes_col \
rowindex ...... 2038-01-19 03:14:17.999999+00:00
8 False ... <NA>
[9 rows x 13 columns])
values = ['int64_col', 'float64_col'], index = 'int64_too'
columns = ['string_col']
    @pytest.mark.parametrize(
        ("values", "index", "columns"),
        [
            ("int64_col", "int64_too", ["string_col"]),
            (["int64_col"], "int64_too", ["string_col"]),
            (["int64_col", "float64_col"], "int64_too", ["string_col"]),
        ],
    )
    def test_df_pivot(scalars_dfs, values, index, columns):
        scalars_df, scalars_pandas_df = scalars_dfs
        bf_result = scalars_df.pivot(
            values=values, index=index, columns=columns
        ).to_pandas()
        pd_result = scalars_pandas_df.pivot(values=values, index=index, columns=columns)
        # Pandas produces NaN, where bq dataframes produces pd.NA
>       pd.testing.assert_frame_equal(bf_result, pd_result, check_dtype=False)
tests/system/small/test_dataframe.py:2294:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
left = array([nan, nan, nan, nan])
right = array([nan, nan, <NA>, nan], dtype=object), err_msg = None
    def _raise(left, right, err_msg) -> NoReturn:
        if err_msg is None:
            if left.shape != right.shape:
                raise_assert_detail(
                    obj, f"{obj} shapes are different", left.shape, right.shape
                )
            diff = 0
            for left_arr, right_arr in zip(left, right):
                # count up differences
                if not array_equivalent(left_arr, right_arr, strict_nan=strict_nan):
                    diff += 1
            diff = diff * 100.0 / left.size
            msg = f"{obj} values are different ({np.round(diff, 5)} %)"
>           raise_assert_detail(obj, msg, left, right, index_values=index_values)
E AssertionError: DataFrame.iloc[:, 0] (column name="('int64_col', <NA>)") are different
E
E DataFrame.iloc[:, 0] (column name="('int64_col', <NA>)") values are different (25.0 %)
E [index]: [-2345, 0, 1, 2]
E [left]: [nan, nan, nan, nan]
E [right]: [nan, nan, <NA>, nan]
.nox/system_prerelease/lib/python3.11/site-packages/pandas/_testing/asserters.py:684: AssertionError
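The mismatch in the traceback is between a float NaN on one side and pd.NA in an object column on the other. One possible way to check whether that is the only difference is to normalize every missing marker to NaN before comparing. This is only a sketch: the two frames are stand-ins built from the [left] and [right] values above, and `missing_as_nan` is a hypothetical helper, not something in the test suite.

```python
import numpy as np
import pandas as pd

index = [-2345, 0, 1, 2]

# Stand-ins for the failing column, using the values printed in the traceback.
left = pd.DataFrame({"col": np.array([np.nan] * 4)}, index=index)
right = pd.DataFrame(
    {"col": np.array([np.nan, np.nan, pd.NA, np.nan], dtype=object)}, index=index
)


def missing_as_nan(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: return a copy where every missing marker is float NaN."""
    as_object = df.astype(object)
    # notna() is False for both NaN and pd.NA, so both get replaced by np.nan.
    return as_object.where(as_object.notna(), np.nan)


# After normalization, the two frames no longer differ in the kind of missing value.
pd.testing.assert_frame_equal(
    missing_as_nan(left), missing_as_nan(right), check_dtype=False
)
```

If the comparison passes after normalization, the failure is purely about NaN vs. pd.NA; if it still fails, the values themselves differ.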