The behavior described is working as intended when Python Protobuf is backed by upb.
The reason is that upb manages memory with an arena model (see https://en.wikipedia.org/wiki/Region-based_memory_management). Under this model there is no bookkeeping for individual allocations. Instead there is one pool that you can only append to, and memory is freed only when the entire pool is released, because nothing tracks which fine-grained allocations are still live. This lowers allocation and deallocation overhead and avoids the memory cost of per-allocation bookkeeping, since everything lives in a single region that is much cheaper to drop.
In the upb model, each new top-level message holds this pool, so anything added to that message has to stay live until the message itself is released.
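To make the effect concrete, here is a toy sketch of an append-only pool owned by a top-level object. It is not upb's implementation: `Arena`, `Message`, and `set_payload` are hypothetical names invented for illustration, and the sizes are arbitrary.

```python
class Arena:
    """Toy append-only pool: no per-allocation bookkeeping, no individual free."""

    def __init__(self):
        self._blocks = []

    def alloc(self, size):
        # Every allocation is appended and stays until the whole arena is dropped.
        block = bytearray(size)
        self._blocks.append(block)
        return block

    def bytes_held(self):
        return sum(len(b) for b in self._blocks)


class Message:
    """Hypothetical top-level message that owns its arena."""

    def __init__(self):
        self.arena = Arena()
        self.payload = None

    def set_payload(self, size):
        # The previous payload becomes unreachable from the message,
        # but its block is still held by the arena.
        self.payload = self.arena.alloc(size)


msg = Message()
for _ in range(100):
    msg.set_payload(1000)

live = len(msg.payload)         # bytes reachable through the message
held = msg.arena.bytes_held()   # bytes the arena is still holding
print(live, held)
```

In this toy model only the last payload is reachable, yet the arena keeps all 100 blocks until `msg` itself is released.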
This is a known tradeoff. For the expected use cases of Protobuf, where you have many request-scoped messages (and some number of permanent immutable constants), it is faster and uses less memory. The unavoidable downside is that if you have a long-lived mutable object whose modifications keep allocating, that memory won't be released until you finally release that one object.
In a pinch, if you really need a long-lived, constantly allocating message, you can use CopyFrom() into a 'fresh' parent to effectively reset it. The new arena then contains only the data that is actually reachable at that moment, at the cost of one deep copy of that state.
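A minimal sketch of that workaround follows. The helper name `compact` is made up for this example. It assumes `msg` is a generated protobuf message, so that `type(msg)()` builds an empty message of the same type and `CopyFrom()` deep-copies into it.

```python
def compact(msg):
    """Return a fresh message of the same type holding a deep copy of msg."""
    fresh = type(msg)()   # new top-level message, so a new arena under upb
    fresh.CopyFrom(msg)   # copies only the data currently reachable from msg
    return fresh


# Typical use with a long-lived message (names are placeholders):
#     state = compact(state)
# Rebinding the name drops this reference to the old message.
```

The old arena can only be freed once nothing refers to the old message any more. Any reference kept elsewhere, including references to its sub-messages, may keep the old pool alive.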