Tuesday, December 08, 2009

The CREATE ANY TRIGGER three card trick.

I've got an interest in security, or at least in vulnerabilities. It isn't something I've really had as part of a job role but I like the tricky way of thinking it entails, and I follow a number of security blogs (both Oracle and generic).

During my degree course, when I was mooching around the Prime computers the students had accounts on I discovered some forgotten directories which allowed me to stop worrying about the number and size of files I wanted to store. My degree course was a 'sandwich' one, where the third year was spent working in the real world, and while there I discovered another 'hole'. In that case it was on a *nix system where, although the login account went straight to a menu rather than giving shell access, I could execute shell commands from within the text editor.

After graduating, I was working for a software provider. One client had an important batch job which failed to run at a weekend. Some clients had support accounts that we could access anytime, others required the account to be unlocked, and one site had to turn on the modem before we could connect. This site was quite secure and I couldn't get to the support account but it was local enough that I could drive to the site. The guy I was dealing with had a basic login with just a list of menu options so I still didn't have any command line or SQL level access to fix the problem. Remembering my previous adventures, I nipped into the MAIL option and from there to the text editor. [Hey, this was pre-internet and pretty much pre-Windows. You sent and read your mail through green-screen terminals.] This was VMS and the environment was smarter than the *nix environment had been and knew I was limited to a locked down account, so the editor wouldn't allow shell commands. But it would let me open up another file in the editor, and so I could edit a file that would be run by the batch job when it restarted. See what I mean about twisty thinking ?

DBAs need to understand the twisty thinking when it comes to protecting databases. So I though I'd push out an Oracle example of twisty thinking.

DBAs know that the *ANY* privileges are generally dangerous and shouldn't be handed out unless absolutely necessary (and even then, with lots of auditing and a large spiky stick). CREATE ANY TRIGGER is one of the trickier ones. On the face of it, it allows a user to create a trigger on anyone's table (or a DDL or event trigger). The trick is that, table triggers do NOT need to be owned by the same schema as that of the table.

So a user, DICK, with 'CREATE ANY TRIGGER' can create a trigger owned by TOM on a table owned by HARRY. That way anyone who inserts, updates or deletes a row in HARRY's table will fire the trigger that runs with TOM's privileges. And given DICK can create the trigger with any owner he wants, the TOM account will be the one with the most privileges.

Obviously, DBAs don't go around granting CREATE ANY TRIGGER to just anyone. A measly developer may just get an account with CREATE TABLE, but that's about it. Once you consider that the developer's account can be the table owner though (HARRY in the example), we have our first twist. A sneaky cracker will need a powerful DICK to create the trigger so he looks around for an account that already has that privilege, such as MDSYS (which owns the spatial datatypes, packages etc). That account is probably LOCKED and EXPIRED, but that doesn't make it unusable. A cracker will be studying the Oracle CPUs and security sites for SQL or PL/SQL injection vulnerabilities in those privileged (and locked down) accounts and those will be used to get access to the CREATE ANY TRIGGER privilege.

To summarise, cracker gets a lowly priviliged HARRY account, uses vulnerabilities in packages owned by default DICK users (despite them being locked) to create triggers owned by TOM users with very high privileges on HARRY's own tables. By inserting into those tables, the cracker can fire a trigger to grant himself DBA or whatever.

From there you can use CREATE ANY DIRECTORY to start hunting around the server file system, and write ssh or rhosts files to get to the oracle login. That may only be on a development system, but some snazzy rootkit style executables there could help the cracker make the jump to production. Or maybe, by pointing the CREATE ANY DIRECTORY cannon at the directory with the Oracle data files, they use UTL_FILE to read the files/blocks relating to C_USER# (the data segment underlying dba_users) and get the password hashes (which are no longer visible in DBA_USERS in 11g). There are some fun things being done to speed up cracking of Oracle passwords so anyone thinking the eight-character alphanumeric password they have been using for a couple of months is safe should think again. You don't use the same password for anything important do you ?

So, have a think about your password complexity and change requirements, applying those CPUs to those development and test databases as well as production, and actively auditing those more powerful privileges (by which I mean you READ about any use of them when they happen, and not six months later).

Thursday, November 26, 2009

DBMS_METADATA_DIFF with 11gR2

A comment on Tom Kyte's blog alerted me to DBMS_METADATA_DIFF in 11gR2.
This allows you to generate a set of 'ALTER ...' scripts that bring two objects into line.

It is documented as being part of
DBMS_METADATA at the PSOUG (formerly Morgan's Library) reference. The latest Morgan's Library reference says it is part of 11.1.0.7 but undocumented for that release.

Say you have added a column, or some constraints to a development or test environment, this can be used to compare that table definition to the production one and give you the statements that need to be applied to production to bring the tables into line.


The bad news is that it
requires the Change Management Option.

Hey Oracle. Can you please STOP installing stuff by default that you can't use without extra payment. Or at least don't create the public synonyms and grants that make it trivial to use.


To use the PL/SQL Profiler, you need to create some SYS tables. There's a script to do this but it has to be manually run by the DBA. Can't they do something similar with the extra cost options ?


Anyway, for the financially challenged, remember
that "Personal Edition includes all of the components that are included with Enterprise Edition, as well as all of the options that are available with Enterprise Edition", and this functionality works with database links. So for a relatively small outlay a DBA can get have Personal Edition sitting on a Windows box and use this handy addition.

Since this is first documented with 11gR2 you may want to until the Windows port of 11gR2 comes out for Personal Edition. If you want to risk it with 11gR1, you'd still need the first patchset, so the OTN version won't do.


09:41:05 GARY@db11gr2 > desc scott.gm_x;
Name Null? Type
--------------- -------- ----------------
1 ID NUMBER
2 VALUE VARCHAR2(10)

09:41:10 GARY@db11gr2 > desc gary.gm_x;
Name Null? Type
------------- -------- ----------------
1 ID NUMBER

09:41:14 GARY@db11gr2 >
select DBMS_METADATA_DIFF.COMPARE_ALTER('TABLE','GM_X','GM_X','SCOTT','GARY') from dual;
DBMS_METADATA_DIFF.COMPARE_ALTER('TABLE','GM_X','GM_X','SCOTT','GARY')
----------------------------------------------------------------------
ALTER TABLE "SCOTT"."GM_X" DROP ("VALUE")

Thursday, November 19, 2009

The amazing disappearing row trick

This curiosity was prompted by a question on stackoverflow
The code inserts a row into a table and then raises an exception.
Despite the exception being caught by an exception handler, and with no explicit rollback, the inserted row has vanished.


drop table dummy;

create table dummy (id number);

DECLARE
v_num NUMBER;
begin
begin
execute immediate
'declare
v_num number(2);
begin
insert into dummy values (1);
dbms_output.put_line(SQL%ROWCOUNT);
v_num := 100;
end;';
exception
when value_error then null;
end;
select count(*) into v_num from dummy;
dbms_output.put_line(v_num);
end;
/


The simple answer is that EXECUTE IMMEDIATE executes an SQL statement (which may be an anonymous PL/SQL block), and that statement will either succeed or fail. If it errors, atomicity states that changes made as part of that statement must be rolled back.

Without the EXECUTE IMMEDIATE, the entirety of the code is a single statement and there is no need for a rollback, since the statement succeeds due to the exception handler.


DECLARE
v_num NUMBER;
begin
begin
declare
v_num number(2);
begin
insert into dummy values (1);
dbms_output.put_line('X:'||SQL%ROWCOUNT);
v_num := 100;
end;
exception
when value_error then null;
end;
select count(*) into v_num from dummy;
dbms_output.put_line('Y:'||v_num);
--
end;
/


PS. You get the same result using EXECUTE IMMEDIATE or DBMS_SQL.

Thursday, November 05, 2009

A quick example of proxy authentication

It is fairly typical to have a bunch of developers working in a single schema. Often you want to keep track of which person did what in that schema. [Ideally, you want to know why, but reading minds is beyond the scope of this article.] One way to improve the traceability of activity in the schema is to give everyone a user account with their own password and then allow those development users to connect as a proxy to the schema user. Since everyone connects to the single schema using their own individual username/password, no-one needs to know the schema password and the DBA can even set it to some random jumble of 20 to 30 characters to prevent guessing.

The proxy user doesn't naturally appear in the V$ views (as far as I can tell), but can be derived from SYS_CONTEXT, and therefore used in a LOGON trigger to set CLIENT_INFO which is visible, or you could trace DDLs with AUDIT or a trigger and store the value there.


C:\>sqlplus gary/gary@xe
SQL*Plus: Release 10.2.0.1.0 - Production

SQL> create user schema1 identified by ***** default tablespace users;
User created.
SQL> alter user schema1 quota unlimited on users;
User altered.
SQL> create table schema1.test (id number);
Table created.
SQL> create user devuser identified by dangermouse default tablespace users;
User created.
SQL> alter user schema1 grant connect through devuser;
User altered.
SQL> grant create session to schema1;
Grant succeeded.
SQL> conn devuser[schema1]/dangermouse@xe
Connected.
SQL> select user from dual;
USER
------------------------------
SCHEMA1
SQL> select sys_context('userenv','proxy_user') from dual;
SYS_CONTEXT('USERENV','PROXY_USER')
--------------------------------------------------------------------------------
DEVUSER


With a JDBC client, such as SQuirrel, I simply put the username as DEVUSER[SCHEMA1] and use the DEVUSER password.

Friday, September 04, 2009

11gR2 nuggets

Haven't installed 11gR2 yet, but found a couple of nuggets in the documentation.

Firstly, WAIT_ON_PENDING_DML in DBMS_UTILITY.
Image this situation. At 11:59, transaction A inserts a record into a table with a CREATED_ON of SYSDATE, but is not yet committed. At 12:00 a batch job kicks off to pick out all the new records, using the CREATED_ON date. Since the one from transaction A isn't committed yet, it is missed by the batch job. If the batch job says "I picked up everything dated before 12:00" it will miss it next time too.
Previously, you might work around this with v$transaction or queuing. A long time ago I even used a LOCK TABLE IN EXCLUSIVE MODE. It was immediately released after it had been obtained but still provided a potential for an issue (though it isn't so dangerous if you add a WAIT 5 on the end). Now you can simply tell your transaction to wait until any other transaction on the table that started before 12:00 has completed.

Secondly, having the option of an EXIT in SQL*Plus default to quitting with a ROLLBACK rather than a COMMIT.
It is not one I'd thought was an issue, but it came up on Stackoverflow at the same time. Coincidence, I guess. Anyway, I'd expect Instant Clients for 11gR2 to come out fairly soon, so this can be put into production a lot quicker. Be warned though, a DISCONNECT or CONNECT will (apparently) still do an implicit commit. That all seems odd to me, as mentally I'd expect an EXIT to do a disconnect from the database session then exit the client (and a CONNECT to do a disconnect from one session and then a connect to a new one) and that any COMMIT/ROLLBACK option would relate to the disconnect option.

Thirdly, hints.
I'm not sure I like this but IGNORE_ROW_ON_DUPKEY_INDEX allows the statement to silently ignore ORA-0001 errors. The row won't be inserted, but the SQL statement won't fail either. Not sure if AFTER ROW triggers will fire or not. Personally I prefer the thought of the error logging clause for this sort of functionality. I just don't like hints which change what the statement does at a logical level.

One hint I want to experiment with is APPEND_VALUES, which is like the APPEND hint, but for FORALL...INSERT in PL/SQL. Not sure how often you would have an array in memory large enough to benefit from this. Also, once you've done a direct-path insert, the table wants a commit/rollback before you can do another. If APPEND_VALUES is the same, you'd need to introduce a commit if you want to use this for looping big chunks of data.

Friday, August 07, 2009

Interlude - Cheap(er) Spatial

In the previous post, I mentioned the cost of the Spatial option. There is a way to both decrease the cost and increase performance of the geocoding. I thought that would grab your attention.

The secret to improving performance is too precompute the answers. That is, work out the geocode for every street and cross street. I'll cover that in the coming posts.

The secret to reducing cost is Oracle Personal Edition. In the licensing guide, "Personal Edition includes all of the components that are included with Enterprise Edition, as well as all of the options that are available with Enterprise Edition". This is distinct from the Enterprise Edition wording which states that it "can be further enhanced with the purchase of the options". My emphasis. So you get Spatial free with Personal Edition.

Personal Edition is normally $460 according to the Oracle Store. That's for a perpetual license though. If you only wanted to do a one-off job, or just reduce your startup costs, it has the option of a one year license for $92 (though support would double that). And, for reasons I can't even guess at, Personal Edition only runs on Windows so you may have to add a Windows license to that.

Of course, with Personal Edition you do have to be very careful that only one person uses it, so this is best suited to an environment with one off processing, or where a bunch of data can be imported, processed and extracted as part of a daily or weekly batch by the license holder.

Wednesday, August 05, 2009

I can see my house ... Part 3 - Sax and the Cities

In the previous post in the series, I mentioned trying to work with the raw OSM file from OpenStreetMap. If you could do this, it would remove the 'dependency' on the extracts from CloudMade.

Whinge

But it isn't easy. The OSM format is XML. But that isn't my problem with it. Yes, XML is verbose and that makes the files big (about 1GB for Australia). I actually think XML is appropriate for this sort of data interchange requirement, as people will be interested in various slices of it and new bits will need adding on as time goes on.

No, what I object to is that practically everything that isn't a reference to a longitude/latitude is a key/value 'tag' element. While I don't like it, I do understand the reason for it given in the FAQ

"Imposing too many rules on them would reduce mapping from a fun and creative activity to a dull and boring data collection task. We give our data away for free, and if our users need more structure, they can convert our data to match their needs."

At this stage in the project, I agree that data volume outweighs data structure. However I suspect (and hope) there will come a time when the project matures to point where more structure is imposed.

Structure, what structure ?

Currently, the OSM is mostly a bunch of nodes/points (with a latitude, longitude and multiple tags) and ways (an ordered set of points, and multiple tags). Some 'ways' are tagged as highways , some are coastlines, some are rivers. A single 'way' can be multiple things (eg a road or river may serve as a boundary between regions or countries, or a 'way' may serve as both road and tramway). There are also relations, which can group nodes and ways (and have their own tags).

Give this problem to a database designer, and they would have looked at a map and think "Hmmm. We need to differentiate between roads, rivers, train lines etc. And most roads have names, maybe speed limits...". They'd come up with a data model, But rather than store these as XML attributes (or sub-elements), they have been relegated to catch-all key/value pairs.

That means, rather than being documented as part of the XML schema (XSD or DTD), it is all on a wiki. And there's no automated way to validate conformance.

But enough whinging. If you want to use the OSM, here's a few suggestions based on my attempts.

How to get the OSM

You can download the entire planet's OSM file from one of the mirrors listed, but our planet is a big place. Personally, I'm just interested in Australia at the moment, and I can get a download of that from OSM Australia.

If, as is likely, you have a different area of interest, you can download the whole planet and use a tidy java app called osmosis, to extract a square, rectangle or other shape containing your region of interest.

They also do daily deltas for the planet.osm. Again, you'd chop out the rectangle of interest and apply that. Australia is surrounded by sea and it is pretty easy to come up with a rectange that includes Australia and nothing else. You can do other shapes too, but I like to keep it simple.

java -jar osmosis.jar --read-xml file="/osm/australia.osm" --bounding-box top="-33.50" bottom="-34.12" left="150.54" right="151.45" --write-xml file="greater_sydney.osm"

The osmosis instructions contains some other filters. Alas there isn't one to extract only highways, but it is open source and I've put in a request for that to the dev list.
How not to load the OSM

I tried to load part of Australia (the greater Sydney area - about 70MB) in one chunk to an Object-Relational XML schema and gave up. It worked for a smaller extract of 4 Mb though and I was using a single core laptop with 1GB of RAM. A bigger system and/or a smaller region and you may have better luck. The SQL for that is osm_schema.sql. You will have to amend the OSM file to put in the appropriate XML headers. Plus I had some character set issues which I resolved through uconv (simply deleting the characters it didn't recognise).

How to load the OSM

After hunting around, I found Marco's blog entry on SAXLoader. Thanks Marco. I've copied the source code from the messages on the thread and saved them on as separate files.
I didn't want to inflict JDeveloper on my little laptop, so I compiled them through the command line.
Your mileage may vary, but I set the following classpath, and compiled as follows:

export CLASSPATH=$CLASSPATH:.:$ORACLE_HOME/lib/xmlparserv2.jar:$ORACLE_HOME/lib/servlet.jar:/$ORACLE_HOME/jdbc/lib/ojdbc6.jar
javac -d *.java


As I said above, under the top level "osm" element, the XML has three different elements, for "node", "way" and "relation". This means three passes through the SaxLoader. I used three destination tables, all simply created like CREATE TABLE OSM_XML_NODE OF "SYS"."XMLTYPE". The easiest way to script the run is with three separate config XML files and call the java for each type.

java -Xmx256M -Dcom.oracle.st.xmldb.pm.ConnectionParameters=/osm/SAXLoader/node.xml com.oracle.st.xmldb.pm.saxLoader.SaxProcessor > node.log

java -Xmx256M -Dcom.oracle.st.xmldb.pm.ConnectionParameters=/osm/SAXLoader/rel.xml com.oracle.st.xmldb.pm.saxLoader.SaxProcessor > rel.log


java -Xmx256M -Dcom.oracle.st.xmldb.pm.ConnectionParameters=/osm/SAXLoader/way.xml com.oracle.st.xmldb.pm.saxLoader.SaxProcessor > way.log


For me, the 'node' took several hours, 'way' was a couple of hours and 'relation' about ten minutes. Again, your mileage will vary depending on your hardware and input file size. The smaller daily delta files should breeze through.

How to extract relational data from the OSM XML elements

The script xml_to_rel.sql is my first effort at this.

There were almost eight million nodes and about 1.5 million tags for Australia (but 1.2 million tags were 'created by', and I don't care about those). From the nodes I extracted the longitude, latitude, user and timestamp attributes. The tags are populated to a child table. Again, this operation takes several hours.

I extracted the details for way in a similar manner (450,000 entries), except there are two child tables. The first is the relationship between WAY and NODE (8 million), and the second the tags for the WAY (1.5 million). I don't bother with the visible attribute for WAY either. The osm_way_node extraction took about seven hours. I believe a large chunk of that was the handful of XMLs over 100kB in size, with tens of thousands of nodes (and I think is the largest of those is the Tasmanian coastline). You could try filtering out based on length(s.SYS_NC_ROWINFO$.getclobval()) if that's a problem for you.

Finally, I 'promoted' the name and highway to attributes of the way.

I haven't gone as far as pulling the longitude/latitudes into MDSYS.SDO_GEOMETRY datatypes, either on the nodes or ways. That would bring the data into line with that derived in the previous post, removing the reliance on the Cloudmade downloads.