I’m working with PostgreSQL for my day job, and liking it.
We’re fairly heavy users of stored procedures implemented in PL/Perl, with ~10,000 lines in ~100 functions (some of which have bloated to painful proportions). This creates some interesting issues and challenges for us.
There’s a window of opportunity now to make improvements to PL/Perl for PostgreSQL 8.5. I’m planning to work with Andrew Dunstan to agree on a set of changes and develop the patches.
As a first step along that road I want to map out here the changes I’m thinking of and to ask for comments and suggestions.
Goals:
- Enable modular programming by pre-loading user libraries.
- Soften the hard choice between plperl and plperlu, so there’s less reason to “give up” and use plperlu.
- Improve performance.
- Improve flexibility for future changes.
- Enable use of tracing/debugging tools.
Specific Proposals:
- Enable configuration of perl at initialization
- Configure extra items to be shared with the Safe compartment
- Permit some more opcodes in the Safe compartment
- Execute END blocks at process end
- Name PL/Perl functions
- Miscellaneous updates to the PL/Perl documentation
- Improve Performance
Add ability to specify in postgresql.conf some code to be run when a perl interpreter is initialized. For example:
plperl.at_init_do = 'use lib qw(/path/to/mylib); use MyPlPerlUtils; use List::Util qw(sum);'
The Safe compartment used for plperl functions can’t access any namespace outside the compartment. So, by default, any subroutines defined by libraries loaded via plperl.at_init_do won’t be callable from plperl functions.
Some mechanism is needed to specify which extra subroutines, and/or variables, should be shared with the Safe compartment. For example:
plperl.safe_share = '$foo, myfunc, sum'
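As a rough illustration of what such sharing amounts to at the Perl level, here is a minimal sketch using the stock Safe module directly. The names myfunc and $foo are hypothetical stand-ins for things a library loaded via plperl.at_init_do might provide; how PL/Perl itself would parse and apply plperl.safe_share is not shown and is still an open design question.

```perl
use strict;
use warnings;
use Safe;
use List::Util qw(sum);

# Hypothetical helper, standing in for a sub provided by a pre-loaded library.
sub myfunc { return "called with: @_" }

# Hypothetical package variable to be made visible inside the compartment.
our $foo = 42;

my $compartment = Safe->new;

# Share selected names from the calling package (main) with the compartment.
# Subs are given with a & sigil, variables with their own sigil.
$compartment->share(qw($foo &myfunc &sum));

# Code evaluated inside the compartment can now call the shared subs.
my $result = $compartment->reval('myfunc(sum(1, 2, 3), $foo)');
die "compartment error: $@" if $@;

print "$result\n";
```

Without the share call, the same reval would fail, because myfunc and sum would not exist in the compartment's namespace.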
I’d like to add the following opcodes to the set of opcodes permitted in the Safe compartment: caller, dbstate, tms.
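For reference, this is roughly what permitting those opcodes looks like with the Safe module on its own. It is only a sketch of the mechanism, assuming a default compartment; the helper name whoami is made up for the example, and PL/Perl would make the equivalent change internally rather than in user code.

```perl
use strict;
use warnings;
use Safe;

my $compartment = Safe->new;

# Add the three opcodes to whatever the default mask already allows.
$compartment->permit(qw(caller dbstate tms));

# 'tms' is the opcode behind the times() builtin.
my $field_count = $compartment->reval('my @t = times; scalar @t');
die "compartment error: $@" if $@;

# 'caller' lets code inside the compartment inspect its own call stack.
my $sub_name = $compartment->reval('sub whoami { (caller(0))[3] } whoami()');
die "compartment error: $@" if $@;

print "times() returned $field_count values\n";
print "caller reported: $sub_name\n";
```

The dbstate opcode is not exercised here; it only matters when code is compiled with the debugger flags enabled, which is what tools like profilers rely on.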
Currently PostgreSQL doesn’t execute END blocks when the backend postgres process exits (oddly, it actually executes them immediately after initializing the interpreter). Fixing that would greatly simplify use of tools like NYTProf that need to know when the interpreter is exiting. Updated: used to say “at server shutdown” which was wrong.
Currently PL/Perl functions are compiled as anonymous subroutines. Applying the same technique as the Sub::Name module would allow them to have ‘more useful’ names than the current ‘__ANON__’.
For a PL/Perl function called “foo”, a minimal implementation would use a name like “foo__id54321” where 54321 is the oid of the function. This avoids having to deal with polymorphic functions (where multiple functions have the same name but different arguments).
The names won’t enable inter-function calling and may not even be installed in the symbol table. They’re just to improve error messages and to enable use of tools like Devel::NYTProf::PgPLPerl (as yet unreleased).
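The naming technique can be sketched in plain Perl. This example uses Sub::Util::set_subname (shipped with recent versions of Scalar-List-Utils) in place of Sub::Name, purely so that it runs without installing anything extra; the function name and oid are the illustrative values from above, and the real change would be made in the PL/Perl C code rather than like this.

```perl
use strict;
use warnings;
use Sub::Util qw(set_subname subname);

# Illustrative values only: a PL/Perl function "foo" with oid 54321.
my ($function_name, $function_oid) = ('foo', 54321);

# PL/Perl currently compiles the function body as an anonymous sub.
my $anon = sub { return (caller(0))[3] };
my $before = subname($anon);

# Attach a descriptive name. Nothing is installed in the symbol table,
# so the sub still cannot be called by name from other functions.
my $named = set_subname("${function_name}__id${function_oid}", $anon);
my $after = subname($named);

print "before: $before\n";
print "after:  $after\n";

# Stack traces and profilers now see the new name as well.
print "caller sees: ", $named->(), "\n";
```

The name shows up in caller(), in die and warn backtraces, and in profiler output, which is all the proposal needs.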
The PL/Perl documentation would be updated to cover the new functionality, and the related text expanded and brought up to date.
On performance, it seems likely that there’s room for improvement. Some code profiling is needed first, though, so I’ll leave this one vague for now.
Any comments on the above?
Anything you’d like to add?
If so, speak up, time is short!
Footnote
For completeness I’ll mention that I was thinking of adding a way to permit extra opcodes (plperl.safe_permit = 'caller') and a way to use a subclass of the Safe module (plperl.safe_class = 'MySafe'). I dropped them because I felt the risks of subtle security issues outweighed the benefits. Any requirements for which these proposals seem like a good fit can also be met via plperl.at_init_do and plperl.safe_share.
You do realize that this would be impossible, right? That interpreters are created per-connection and no interpreter state can be preserved past the end of the session (each session is a separate process)?
Most of the rest of your points would be better addressed at the perl level (Safe is a joke compared to, for example, TCL’s safe interpreters).
Comment by Andrew G. — October 6, 2009 @ 4:08 am |
Thank you, yes, I mean at the end of the process. I’m not sure what you mean by “Most of the rest of your points would be better addressed at the perl level”.
Comment by TimBunce — October 6, 2009 @ 8:40 am |
Points 1-3 are primarily attempts to work around the lack of features that should have been in Safe.pm in the first place.
For comparison, in TCL if you create a safe interpreter using ::safe::interpCreate, then the safe interpreter can load modules etc. in the normal way, subject to (a) the specified mapping between host filenames and the virtual filenames seen inside the safe interpreter, and (b) the fact that the loaded modules are still running inside the safe interpreter and have no more access than it does.
A proper system of external interfaces for Safe would obviate the need for many of the features you are requesting.
Incidentally, is the sort{} + threads + Safe bug (#60374) fixed yet? (and if it is, why is it still open?)
Comment by Andrew G. — October 6, 2009 @ 12:31 pm |
It’s not fixed in 5.10.1. I’ve asked for an update on the status. Meanwhile I’ve added a comment on the ticket with a simple workaround that would be effective for plperl and that I’ll include in my changes.
Comment by TimBunce — October 6, 2009 @ 8:03 pm
I don’t have anything smart to write here, but I wanted to express my support, especially for point 1 above. It was a long time ago that I learned about PL/Perl and I was initially thrilled about the possibilities, but after I learned about the constraints on loading modules I realized that its application is very limited.
Comment by Zbigniew Lukasiak — October 6, 2009 @ 8:01 am |
Suggested change: rename plperl.at_init_do to plperl.at_perl_init_do and add plperl.at_safe_init_do which would be similar but specify code to run when the Safe compartment is initialized. It could thus be set by per-user/per-role GUC.
Comment by TimBunce — October 6, 2009 @ 8:40 pm |
Item 1 could be controversial. It’s undoubtedly useful, but it creates an action at a distance that effectively invites users to create mutually incompatible PL/Perl installations. Ideally, you would want users to create libraries of reusable PL/Perl functions, but when those only work with certain at_init_do settings, you create a big mess. Look at PHP; they have done something quite similar with their php.ini.
Comment by Peter Eisentraut — October 7, 2009 @ 12:24 pm |
I see your point. What’s needed is some equivalent of a “use” statement that can be put at the start of plperl functions. I have a few ideas – I’ll give it some thought. Thanks.
Comment by TimBunce — October 7, 2009 @ 10:13 pm |