-
Notifications
You must be signed in to change notification settings - Fork 8
FEAT: Bulk Copy Wrapper #108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: jahnvi/bulk_copy_main_apis
Are you sure you want to change the base?
FEAT: Bulk Copy Wrapper #108
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no explicit thread safety; if the same BCPWrapper object is used from multiple threads, state flags and ODBC calls could race.
Some methods (e.g., bcp_control int overload) throw exceptions on error, others (e.g., bcp_control wstring overload, read_format_file) just return SQL_ERROR. For a pybind11-exposed wrapper, it's better to throw exceptions for all error states, so Python users get errors, not silent failures.
Recommendation:
Add a note/documentation about thread-safety, or add a mutex if needed.
Always throw std::runtime_error on unrecoverable errors.
} | ||
} | ||
|
||
BCPWrapper::~BCPWrapper() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Destructor currently only logs. If possible, ensure that resources are reliably released (calling close() or finish() if not already done), possibly via RAII.
@@ -0,0 +1,361 @@ | |||
#include "bcp_wrapper.h" // Includes ddbc_bindings.h (and thus sql.h, sqlext.h, <string>, <memory>) and connection.h |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment Style Suggestions
- Start with a brief summary of what a block does, especially if it’s not immediately obvious.
- Use inline comments for tricky lines, and block comments for larger logic sections.
- If you’re handling edge cases (like empty strings, or non-ASCII data), call it out explicitly.
- For all conversions between types (especially between Python/C++ or encodings), note assumptions and possible gotchas for future readers.
} | ||
|
||
SQLRETURN BCPWrapper::bcp_control(const std::wstring& property_name, const std::wstring& value) { | ||
if (!_bcp_initialized || _bcp_finished) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a failed property lookup or type mismatch only logs and returns SQL_ERROR, does not throw. For consistency, consider always throwing on logic errors (as you do for the int version).
|
||
std::string value_utf8 = py::cast(value).cast<std::string>(); | ||
// Fix the BCPHINTS issue: use the wide string directly rather than converting to narrow | ||
SQLRETURN ret = BCPControlW_ptr(_hdbc, it->second.option_code, (LPVOID)value_utf8.c_str()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If ODBC expects UTF-16/SQLWCHAR* for a property like BCPHINTS, passing a UTF-8 narrow string pointer is incorrect, especially for non-ASCII data.
You should convert std::wstring to a buffer of SQLWCHAR (UTF-16), and pass that pointer instead.
std::vector wide_value(value.c_str(), value.c_str() + value.size() + 1);
SQLRETURN ret = BCPControlW_ptr(_hdbc, it->second.option_code, (LPVOID)wide_value.data());
The way wide-string values are currently handled for BCP control is not robust for non-ASCII input and will cause bugs in some scenarios.
INT get_bcp_direction_code(const std::wstring& direction_str) { | ||
if (direction_str == L"in") return DB_IN; | ||
if (direction_str == L"out" || direction_str == L"queryout") return DB_OUT; | ||
throw std::runtime_error("Invalid BCP direction string: " + py::cast(direction_str).cast<std::string>()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make the string matching case-insensitive for better robustness.
If user passes "IN" instead of "in" (capitalization), it will throw.
This pull request introduces support for Bulk Copy Program (BCP) functionality in the
mssql_python
project. The changes include adding a newBCPWrapper
class, integrating BCP APIs, and modifying existing classes to support BCP operations. Below is a summary of the most important changes:Integration of BCP Functionality
BCPWrapper
Class: AddedBCPWrapper
inmssql_python/pybind/bcp/bcp_wrapper.{cpp,h}
to encapsulate BCP operations such as initialization, control, column formatting, and execution. This class interfaces with ODBC BCP APIs and handles diagnostic messages for error reporting.BCPInitW
,BCPControlW
,BCPReadFmtW
, etc.) inddbc_bindings.cpp
. These are used byBCPWrapper
to interact with BCP operations.Modifications to Support BCP
CMake Configuration: Updated
CMakeLists.txt
to include the newbcp_wrapper.cpp
file and its corresponding header directory. This ensures the new functionality is compiled and linked.Connection Class Enhancements: Added methods to retrieve the database connection handle (
getDbcHandle
) and theConnection
object (getConnection
) inconnection.h
. These are required byBCPWrapper
to access the underlying ODBC connection handle.BCP Attribute Setting: Modified
applyAttrsBefore
inconnection.cpp
to support setting theSQL_COPT_SS_BCP
attribute before establishing a connection. This enables bulk copy mode for connections.General Updates
bcp_wrapper.h
toddbc_bindings.cpp
to ensure the newBCPWrapper
class and BCP-related functionality are accessible.These changes collectively add robust support for BCP operations, enabling efficient bulk data transfer in the
mssql_python
library.ADO