Skip to content

Proof-of-concept: Protect decrypted relation keys #446

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 2 commits into
base: TDE_REL_17_STABLE
Choose a base branch
from

Conversation

AndersAstrand
Copy link
Collaborator

@AndersAstrand AndersAstrand commented Jun 20, 2025

I just finished this up yesterday and have not yet run any performance tests at all. Maybe this is way too slow and not a viable solution at all. I will run performance tests on monday. But any comments on the approach is highly appreciated either way 🙂

The main idea is that the relation cache only contains still encrypted keys, and that any decrypted keys are put in memory that's allocated using openssl's secure allocators to make sure they're never swapped or present in a core dump et.c.

It uses ResourceOwner to ensure we're not leaking memory even though we use a custom allocator.

WAL keys were ignored in this PoC, but they of course need the same treatment if we move forward with this.

Using openssl's secure allocators for these ensures they are never
swapped to disk or present in a core dump.

Using ResourceOwner for them ensures we don't leak memory when
transactions are aborted.

This commit is mostly a poc for only keeping encrypted keys in the smgr
cache, as that memory is not protected.
@codecov-commenter
Copy link

codecov-commenter commented Jun 20, 2025

Codecov Report

Attention: Patch coverage is 85.93750% with 18 lines in your changes missing coverage. Please review.

Project coverage is 84.55%. Comparing base (5610639) to head (baf9640).

❌ Your project status has failed because the head coverage (84.55%) is below the target coverage (90.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
@@                  Coverage Diff                  @@
##           TDE_REL_17_STABLE     #446      +/-   ##
=====================================================
- Coverage              84.66%   84.55%   -0.12%     
=====================================================
  Files                     21       22       +1     
  Lines                   2589     2674      +85     
  Branches                 402      408       +6     
=====================================================
+ Hits                    2192     2261      +69     
- Misses                   316      327      +11     
- Partials                  81       86       +5     
Components Coverage Δ
access 82.20% <97.36%> (+1.08%) ⬆️
catalog 88.22% <ø> (ø)
common 77.77% <ø> (ø)
encryption 74.68% <77.77%> (+1.23%) ⬆️
keyring 72.88% <ø> (ø)
src 91.44% <ø> (ø)
smgr 91.57% <84.44%> (-3.31%) ⬇️
transam ∅ <ø> (∅)
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@AndersAstrand
Copy link
Collaborator Author

(also ignore the infinite recursion I realized in the shower that I left in there :D )

@AndersAstrand
Copy link
Collaborator Author

AndersAstrand commented Jun 24, 2025

Did some preliminary performance testing and I see about a 0.4% slowdown compared to our current version as measured in how many queries sysbench can perform during 20 seconds on my laptop using the oltp_read_write test.

To me this could just be noise in the testing, but indicates that this is not insanely slow at least.

Copy link
Collaborator

@jeltz jeltz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not happy about the overhead of a change like this and I do not understand what we win. We could achieve the same thing with less overhead if we use the .smgr_close function or with even less if we add a .smgr_destroy function.

Assert(principal_key);

*rel_key_data = map_entry->enc_key;

if (!AesGcmDecrypt(principal_key->keyData,
map_entry->entry_iv, MAP_ENTRY_IV_SIZE,
(unsigned char *) map_entry, offsetof(TDEMapEntry, enc_key),
(uint8 *) &rlocator, sizeof(RelFileLocator),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change to uint8 seems very arbitrary. Doesn't PostgreSQL use unsigned char everywhere for this?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said I prefer uint8 over unsigned char, but that is not what I think we or PostgreSQL are using. So now you introduced a new way. Also it does seem you have not changed AesGcmDecrypt.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes in pg_tde_tdemap.c was just to support the way I calculated aad in the other file, so that rotations wouldn't break everything. They are very PoC.

Copy link
Collaborator Author

@AndersAstrand AndersAstrand Jun 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The postgres code uses uint8 about as commonly as unsigned char. I picked the one I like :D. The type wasn't the point there at all but instead to use the full rlocator as aad.

For WAL keys we currently use a fake rlocator that's the same for all of them so they need something else (probably start_lsn).

EDIT: excluding contrib/ unsigned char seems to be used about 66% of the time.

@AndersAstrand
Copy link
Collaborator Author

AndersAstrand commented Jun 30, 2025

I am not happy about the overhead of a change like this and I do not understand what we win. We could achieve the same thing with less overhead if we use the .smgr_close function or with even less if we add a .smgr_destroy function.

What we win? Decrypted key material is protected. That's the whole win. How we free them (ResourceOwners vs smgr_close) is kind of irrelevant for that.

EDIT: A secondary win is moving away from the horrible InternalKey structure. But that's secondary :D

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants