Use text format instead of binary for the 'text' type for sending? #407

@Emill


Hi. I just came across the two different functions that read input for the text datatype in the PostgreSQL backend:
https://github.com/postgres/postgres/blob/e246b3d6eac09d0770e6f68e69f2368d02db88af/src/backend/utils/adt/varlena.c#L485
and
https://github.com/postgres/postgres/blob/e246b3d6eac09d0770e6f68e69f2368d02db88af/src/backend/utils/adt/varlena.c#L511

It seems that the binary variant performs one more allocation and copy than the text variant. The text variant just calls cstring_to_text directly on the buffer, which in turn calls cstring_to_text_with_len(s, strlen(s)). (The strlen call is redundant, since the length is already known: it is written as a header in the query.)

The binary variant, on the other hand, first calls pq_getmsgtext, which allocates a new buffer, copies the data into it, and appends a trailing null character (even though the original buffer already has one). The result is then passed to cstring_to_text_with_len, which copies the string yet again into another newly allocated buffer. That function never looks at the trailing null character; it prepends its own length header instead.
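To make the difference concrete, here is a minimal model of the two paths as described above. It is only a sketch, not the PostgreSQL source: `model_text`, the `model_*` functions and the `copies_made` counter are hypothetical stand-ins, `malloc` replaces the backend's allocator, and any encoding conversion the real code may perform is left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counter so the copies can be observed; not part of PostgreSQL. */
static int copies_made = 0;

/* Stand-in for a text value: a length header plus the bytes (no trailing NUL). */
typedef struct {
    size_t len;
    char  *data;
} model_text;

/* Models cstring_to_text_with_len: allocate a new value and copy len bytes. */
static model_text model_cstring_to_text_with_len(const char *s, size_t len)
{
    model_text t;
    t.len = len;
    t.data = malloc(len ? len : 1);
    assert(t.data != NULL);
    memcpy(t.data, s, len);
    copies_made++;
    return t;
}

/* Models the text-format path: strlen on the message buffer, then one copy. */
static model_text model_textin(const char *msg)
{
    return model_cstring_to_text_with_len(msg, strlen(msg));
}

/* Models pq_getmsgtext: copy the bytes into a fresh NUL-terminated buffer. */
static char *model_getmsgtext(const char *msg, size_t len)
{
    char *tmp = malloc(len + 1);
    assert(tmp != NULL);
    memcpy(tmp, msg, len);
    tmp[len] = '\0';
    copies_made++;
    return tmp;
}

/* Models the binary-format path: temporary copy first, then the final copy. */
static model_text model_textrecv(const char *msg, size_t len)
{
    char *tmp = model_getmsgtext(msg, len);
    model_text t = model_cstring_to_text_with_len(tmp, len);
    free(tmp); /* the temporary and the result coexist until this point */
    return t;
}
```

In this model, `model_textin` leaves `copies_made` at 1 while `model_textrecv` leaves it at 2, and in the second case the temporary buffer and the final value are both alive at the same time just before the `free`.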

I find this hard to believe. For small strings it probably doesn't matter, but if the user sends large strings, say a few hundred megabytes, this should hurt, since RAM usage is doubled while the string is being copied around.

These two methods are called from the Bind message handler: https://github.com/postgres/postgres/blob/e246b3d6eac09d0770e6f68e69f2368d02db88af/src/backend/tcop/postgres.c#L1572

So, if someone can confirm that I'm not mistaken, I think we should always send "text" parameters in text format rather than binary.
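On the wire, the proposed change would amount to flipping the parameter format code in the Bind message from 1 (binary) to 0 (text). The sketch below builds a Bind message with a single parameter, following the layout in the PostgreSQL frontend/backend protocol documentation. `build_bind_one_param` and the `FORMAT_*` names are hypothetical helpers for illustration, not part of any driver, and the sketch assumes the string bytes are already in the client encoding, so that the payload is the same in both formats.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { FORMAT_TEXT = 0, FORMAT_BINARY = 1 };

/* Write a 16-bit integer in network byte order; returns bytes written. */
static size_t put_int16(uint8_t *out, uint16_t v)
{
    out[0] = (uint8_t)(v >> 8);
    out[1] = (uint8_t)(v & 0xff);
    return 2;
}

/* Write a 32-bit integer in network byte order; returns bytes written. */
static size_t put_int32(uint8_t *out, uint32_t v)
{
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)(v & 0xff);
    return 4;
}

/*
 * Hypothetical helper: Bind for the unnamed portal and unnamed statement
 * with one parameter. `out` must have room for 19 + len bytes.
 * Returns the total number of bytes written.
 */
static size_t build_bind_one_param(uint8_t *out, uint16_t format,
                                   const char *value, uint32_t len)
{
    size_t pos = 0;

    assert(out != NULL && value != NULL);
    out[pos++] = 'B';                       /* message type */
    pos += 4;                               /* message length, filled in below */
    out[pos++] = '\0';                      /* portal name: "" */
    out[pos++] = '\0';                      /* statement name: "" */
    pos += put_int16(out + pos, 1);         /* one parameter format code */
    pos += put_int16(out + pos, format);    /* 0 = text, 1 = binary */
    pos += put_int16(out + pos, 1);         /* one parameter value */
    pos += put_int32(out + pos, len);       /* value length in bytes */
    memcpy(out + pos, value, len);          /* value bytes, no trailing NUL */
    pos += len;
    pos += put_int16(out + pos, 0);         /* no result format codes: all text */
    put_int32(out + 1, (uint32_t)(pos - 1)); /* length excludes the type byte */
    return pos;
}
```

Under these assumptions, calling the helper with `FORMAT_TEXT` and with `FORMAT_BINARY` for the same string produces messages that differ only in the format code.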
