Skip to content

FEAT: Bulk Copy Wrapper #108

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: jahnvi/bulk_copy_main_apis
Choose a base branch
from

Conversation

jahnvi480
Copy link
Contributor

This pull request introduces support for Bulk Copy Program (BCP) functionality in the mssql_python project. The changes include adding a new BCPWrapper class, integrating BCP APIs, and modifying existing classes to support BCP operations. Below is a summary of the most important changes:

Integration of BCP Functionality

  • New BCPWrapper Class: Added BCPWrapper in mssql_python/pybind/bcp/bcp_wrapper.{cpp,h} to encapsulate BCP operations such as initialization, control, column formatting, and execution. This class interfaces with ODBC BCP APIs and handles diagnostic messages for error reporting.
  • BCP API Pointers: Declared and initialized function pointers for ODBC BCP APIs (BCPInitW, BCPControlW, BCPReadFmtW, etc.) in ddbc_bindings.cpp. These are used by BCPWrapper to interact with BCP operations.

Modifications to Support BCP

  • CMake Configuration: Updated CMakeLists.txt to include the new bcp_wrapper.cpp file and its corresponding header directory. This ensures the new functionality is compiled and linked.

  • Connection Class Enhancements: Added methods to retrieve the database connection handle (getDbcHandle) and the Connection object (getConnection) in connection.h. These are required by BCPWrapper to access the underlying ODBC connection handle.

  • BCP Attribute Setting: Modified applyAttrsBefore in connection.cpp to support setting the SQL_COPT_SS_BCP attribute before establishing a connection. This enables bulk copy mode for connections.

General Updates

  • Header Inclusion: Added bcp_wrapper.h to ddbc_bindings.cpp to ensure the new BCPWrapper class and BCP-related functionality are accessible.

These changes collectively add robust support for BCP operations, enabling efficient bulk data transfer in the mssql_python library.

ADO

Copy link
Contributor

@Ramudaykumar Ramudaykumar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no explicit thread safety; if the same BCPWrapper object is used from multiple threads, state flags and ODBC calls could race.

Some methods (e.g., bcp_control int overload) throw exceptions on error, others (e.g., bcp_control wstring overload, read_format_file) just return SQL_ERROR. For a pybind11-exposed wrapper, it's better to throw exceptions for all error states, so Python users get errors, not silent failures.

Recommendation:
Add a note/documentation about thread-safety, or add a mutex if needed.
Always throw std::runtime_error on unrecoverable errors.

}
}

BCPWrapper::~BCPWrapper() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Destructor currently only logs. If possible, ensure that resources are reliably released (calling close() or finish() if not already done), possibly via RAII.

@@ -0,0 +1,361 @@
#include "bcp_wrapper.h" // Includes ddbc_bindings.h (and thus sql.h, sqlext.h, <string>, <memory>) and connection.h
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment Style Suggestions

  • Start with a brief summary of what a block does, especially if it’s not immediately obvious.
  • Use inline comments for tricky lines, and block comments for larger logic sections.
  • If you’re handling edge cases (like empty strings, or non-ASCII data), call it out explicitly.
  • For all conversions between types (especially between Python/C++ or encodings), note assumptions and possible gotchas for future readers.

}

SQLRETURN BCPWrapper::bcp_control(const std::wstring& property_name, const std::wstring& value) {
if (!_bcp_initialized || _bcp_finished) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a failed property lookup or type mismatch only logs and returns SQL_ERROR, does not throw. For consistency, consider always throwing on logic errors (as you do for the int version).


std::string value_utf8 = py::cast(value).cast<std::string>();
// Fix the BCPHINTS issue: use the wide string directly rather than converting to narrow
SQLRETURN ret = BCPControlW_ptr(_hdbc, it->second.option_code, (LPVOID)value_utf8.c_str());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If ODBC expects UTF-16/SQLWCHAR* for a property like BCPHINTS, passing a UTF-8 narrow string pointer is incorrect, especially for non-ASCII data.
You should convert std::wstring to a buffer of SQLWCHAR (UTF-16), and pass that pointer instead.

std::vector wide_value(value.c_str(), value.c_str() + value.size() + 1);
SQLRETURN ret = BCPControlW_ptr(_hdbc, it->second.option_code, (LPVOID)wide_value.data());

The way wide-string values are currently handled for BCP control is not robust for non-ASCII input and will cause bugs in some scenarios.

INT get_bcp_direction_code(const std::wstring& direction_str) {
if (direction_str == L"in") return DB_IN;
if (direction_str == L"out" || direction_str == L"queryout") return DB_OUT;
throw std::runtime_error("Invalid BCP direction string: " + py::cast(direction_str).cast<std::string>());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make the string matching case-insensitive for better robustness.

If user passes "IN" instead of "in" (capitalization), it will throw.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants