test_df_pivot tests failed at the system_prerelease session #341

Open
@chelsea-lin

After fixing the failing tests mentioned in #337, other tests still fail:

FAILED tests/system/small/test_dataframe.py::test_dataframe_bool_aggregates[all_axis0] - AssertionError: Series.index are different
FAILED tests/system/small/test_dataframe.py::test_dataframe_bool_aggregates[any_axis0] - AssertionError: Series.index are different
FAILED tests/system/small/test_dataframe.py::test_df_pivot[values2-int64_too-columns2] - AssertionError: DataFrame.iloc[:, 0] (column name="('int64_col', <NA>)") are different
FAILED tests/system/small/test_groupby.py::test_dataframe_groupby_analytic[cumprod] - AssertionError: DataFrame.iloc[:, 1] (column name="int64_col") are different
FAILED tests/system/small/test_dataframe.py::test_to_pandas_downsampling_option_override - assert 1.3427486419677734 == 1 ± 3.0e-01
FAILED tests/system/small/test_series.py::test_series_add_prefix - AssertionError: Series.index are different
FAILED tests/system/small/test_series.py::test_series_add_suffix - AssertionError: Series.index are different
FAILED tests/system/small/test_series.py::test_groupby_window_ops[cumprod] - AssertionError: Series are different
FAILED tests/system/small/test_series.py::test_string_astype_int - AssertionError: Series.index are different

@tswast mentioned that the failures caused by the RangeIndex vs. Int64Index distinction could be unblocked by setting check_index_type=False: https://pandas.pydata.org/docs/reference/api/pandas.testing.assert_series_equal.html#pandas.testing.assert_series_equal
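
For reference, a minimal sketch of how check_index_type=False relaxes the comparison. The two series here are made up for illustration (an integer index vs. a float index with the same labels) and are not taken from the failing tests; the actual index types involved in the prerelease session may differ.

```python
import pandas as pd

# Hypothetical data: same values and index labels, but different index dtypes.
left = pd.Series([1, 2, 3], index=pd.Index([0, 1, 2]))
right = pd.Series([1, 2, 3], index=pd.Index([0.0, 1.0, 2.0]))


def strict_comparison_fails() -> bool:
    """Return True if the default (index-type-checking) comparison raises."""
    try:
        pd.testing.assert_series_equal(left, right)
    except AssertionError:
        return True
    return False


strict_fails = strict_comparison_fails()

# With check_index_type=False the index class/dtype is not compared,
# only the index labels themselves, so this call does not raise.
pd.testing.assert_series_equal(left, right, check_index_type=False)
relaxed_passes = True
```

This only hides index type differences; mismatched index labels or values would still be reported.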

But the iloc errors may be real issues. The call stack for the test_df_pivot failure is shown below:

=================================== FAILURES ===================================
__________________ test_df_pivot[values2-int64_too-columns2] ___________________

scalars_dfs = (          bool_col                                          bytes_col  \
rowindex                                    ......  2038-01-19 03:14:17.999999+00:00
8            False  ...                              <NA>

[9 rows x 13 columns])
values = ['int64_col', 'float64_col'], index = 'int64_too'
columns = ['string_col']

    @pytest.mark.parametrize(
        ("values", "index", "columns"),
        [
            ("int64_col", "int64_too", ["string_col"]),
            (["int64_col"], "int64_too", ["string_col"]),
            (["int64_col", "float64_col"], "int64_too", ["string_col"]),
        ],
    )
    def test_df_pivot(scalars_dfs, values, index, columns):
        scalars_df, scalars_pandas_df = scalars_dfs
    
        bf_result = scalars_df.pivot(
            values=values, index=index, columns=columns
        ).to_pandas()
        pd_result = scalars_pandas_df.pivot(values=values, index=index, columns=columns)
    
        # Pandas produces NaN, where bq dataframes produces pd.NA
>       pd.testing.assert_frame_equal(bf_result, pd_result, check_dtype=False)

tests/system/small/test_dataframe.py:2294: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

left = array([nan, nan, nan, nan])
right = array([nan, nan, <NA>, nan], dtype=object), err_msg = None

    def _raise(left, right, err_msg) -> NoReturn:
        if err_msg is None:
            if left.shape != right.shape:
                raise_assert_detail(
                    obj, f"{obj} shapes are different", left.shape, right.shape
                )
    
            diff = 0
            for left_arr, right_arr in zip(left, right):
                # count up differences
                if not array_equivalent(left_arr, right_arr, strict_nan=strict_nan):
                    diff += 1
    
            diff = diff * 100.0 / left.size
            msg = f"{obj} values are different ({np.round(diff, 5)} %)"
>           raise_assert_detail(obj, msg, left, right, index_values=index_values)
E           AssertionError: DataFrame.iloc[:, 0] (column name="('int64_col', <NA>)") are different
E           
E           DataFrame.iloc[:, 0] (column name="('int64_col', <NA>)") values are different (25.0 %)
E           [index]: [-2345, 0, 1, 2]
E           [left]:  [nan, nan, nan, nan]
E           [right]: [nan, nan, <NA>, nan]

.nox/system_prerelease/lib/python3.11/site-packages/pandas/_testing/asserters.py:684: AssertionError
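
The [left] and [right] arrays in the traceback differ only in which missing-value marker they hold. The sketch below rebuilds those two arrays by hand (it does not call pivot) to show that both are entirely missing according to pd.isna, even though NaN and pd.NA are distinct objects. The normalization step at the end is one possible way to make them comparable and is an assumption for illustration, not a fix proposed in this issue.

```python
import numpy as np
import pandas as pd

# Arrays copied from the traceback above.
left = np.array([np.nan, np.nan, np.nan, np.nan])
right = np.array([np.nan, np.nan, pd.NA, np.nan], dtype=object)

# NaN and pd.NA are different objects, which is why a strict
# element-wise comparison can flag one of the four rows.
markers_are_distinct = np.nan is not pd.NA

# pd.isna treats both markers as missing, so the masks agree.
left_mask = pd.isna(left)
right_mask = pd.isna(right)

# Illustrative normalization: replace every missing marker with NaN,
# then cast the object array to float64.
right_normalized = np.where(right_mask, np.nan, right).astype("float64")
```

After normalization, right_normalized is a float64 array of NaN values, matching the dtype and contents of left.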
