migration for inactive user purge #6676

Open
wants to merge 5 commits into
base: main
from

Conversation

smithellis
Contributor

1 - Gets all users we consider inactive (3 years since last login)
2 - Divides them into users with content and users without content
3 - Hard-deletes non-content users using the _base_manager
4 - Pushes the other users through the deletion pipeline
5 - Implements batching to manage resources
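
The five steps above can be sketched in plain Python, without the ORM, to show the intended control flow. This is only an illustration under stated assumptions: `Account`, `plan_purge`, `batched`, and `BATCH_SIZE` are hypothetical names, not code from this PR, and the real migration works on querysets rather than lists.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from itertools import islice
from typing import Iterable, Iterator, List, Optional, Tuple

INACTIVITY_WINDOW = timedelta(days=3 * 365)  # "3 years since login", as in the description
BATCH_SIZE = 2  # assumed value, kept tiny so the batching is visible


@dataclass
class Account:
    """Hypothetical stand-in for a user row; not the project's User model."""

    username: str
    last_login: Optional[datetime]
    has_content: bool


def batched(items: Iterable[Account], size: int) -> Iterator[List[Account]]:
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def plan_purge(
    accounts: Iterable[Account], now: datetime
) -> Tuple[List[List[Account]], List[List[Account]]]:
    """Return (hard_delete_batches, pipeline_batches) for inactive accounts."""
    cutoff = now - INACTIVITY_WINDOW
    # Accounts that never logged in are skipped here, mirroring how a
    # `last_login__lt` lookup does not match NULL values.
    inactive = [
        a for a in accounts if a.last_login is not None and a.last_login < cutoff
    ]
    without_content = [a for a in inactive if not a.has_content]
    with_content = [a for a in inactive if a.has_content]
    return (
        list(batched(without_content, BATCH_SIZE)),
        list(batched(with_content, BATCH_SIZE)),
    )
```

Keeping the two groups separate is what lets the cheap hard delete run on the (presumably larger) group without content, while only the remaining users pay for the full deletion pipeline.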

Contributor

@escattone escattone left a comment

r+wc

@escattone
Contributor

@smithellis Forgot to mention one more thing. Just FYI, you can use migrations.RunPython.noop instead of defining a reverse function that does nothing. It's equivalent.


cutoff_date = timezone.now() - timedelta(days=3*365)

query = User.objects.filter(last_login__lt=cutoff_date).annotate(has_content=has_content_criteria)
Collaborator

Why are you using annotate here instead of filter/exclude/Exists?

Collaborator

You cannot pass a Q object directly to annotate(), at least not in the 4.2+ version that we are using. Compare annotation in 5.2 vs 4.2.

Contributor Author

This use case absolutely works. I think it's just not specifically called out in the 4.2 docs, though it is in later docs. I can run this query and see the output, and it builds valid SQL that executes properly.

I'm annotating here so we can later divide our users into those with and those without content, so we can execute a quicker delete process on non-content users.

Contributor Author

I changed this to two separate queries.

"""
Delete users who haven't logged in for over three years.
"""
User = get_user_model()
Collaborator

This needs to be User = apps.get_model("auth", "User") in migrations, similar to the previous one (0034).

Contributor Author

This is necessary because the delete_user_pipeline function needs an actual User instance vs. a historical model.

Collaborator

If we cannot use apps.get_model() here, we should probably refactor and change our approach. The migration needs to work with the model state at the time of its creation. If the user model changes in the future, the migration will most likely fail.

I would suggest either refactoring the migration to use only methods compatible with the historical model (e.g., avoiding properties and custom managers) or considering a different approach altogether.
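
As a rough illustration of the historical-model approach, the sketch below sticks to plain field lookups on the model returned by `apps.get_model()`. It is not the PR's code: the split between users with and without content and the deletion pipeline are left out, and the standard-library clock is used only to keep the example self-contained.

```python
from datetime import datetime, timedelta, timezone


def purge_inactive_users(apps, schema_editor):
    """Delete users who haven't logged in for over three years."""
    # Historical model: rebuilt from migration state, so it carries the
    # fields as of this migration but not the current class's properties,
    # methods, or (by default) custom managers.
    User = apps.get_model("auth", "User")
    # Assumption for this sketch: an aware UTC timestamp from the standard
    # library instead of django.utils.timezone.now().
    cutoff_date = datetime.now(tz=timezone.utc) - timedelta(days=3 * 365)
    # Only a plain field lookup is used, which the historical model supports.
    User.objects.filter(last_login__lt=cutoff_date).delete()
```

A function like this would be passed to `migrations.RunPython` in the usual way. Anything that needs the live `User` class, such as a pipeline relying on custom methods, would have to run outside the migration or be rewritten against fields only.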

Collaborator

@akatsoulas akatsoulas left a comment

Overall this looks good but we should use the historical user model here.

3 participants