Re: [PHPLIB] Re: Am I doing right? From: Kristian Koehntopp (kris <email protected>)
Date: 11/14/99

On Sat, Nov 13, 1999 at 05:02:17PM +0100, Björn Schotte wrote:
> When someone logged in and edits a new appointment,
> the $auth->auth["uname"] is inserted into the table's
> uname. Is that enough to do or should I insert
> $auth->auth["uid"] ? Or both?

[ A longer sermon about 1:1 relation ships follows. Skip to the
  next comment if you are not interested -- KK ]

The auth_user table is a strange table, because it has two
unique columns: username and uid. Only the uid column is being
used as a PRIMARY KEY, while the username column is only
labelled UNIQUE KEY.

Effectively, the auth_user table models a 1:1 relationship
between the active_sessions and auth_user for those rows of
active_sessions where p_name is the name of a User subclass. In
an ER model, 1:1 relationships are seldom used and should be
carefully scrutinized. They have uses, though: They are used to

- model class->subclass relationships,
- store large, optional row components,
- split tables for performance,
- implement key translations.

All these are in fact facets of the same problem.

Class->sublass relation:

Suppose your have a class A having several slots a, b and c. You
also have subclasses U and V extending A, which U having the
additional slots p and q and V having s and t. You could then
have a table storing these objects

A_table:

class | id | a | b | c | p | q | s | t
------+----+---+---+---+---+---+---+---
 U | 12 | x | x | x | x | x |\N |\N
 V | 14 | x | x | x |\N |\N | x | x

Here the PRIMARY KEY is the id column and the class column
denotes the type of object stored in a row. Depending on the
content of the class column, either s and t or p and q can be
NULL values. This is a difficult constraint (In MySQL you cannot
even formulate it).

Alternatively you could create multiple tables:

A_table:

class | id | a | b | c
------+----+---+---+---
 U | 12 | x | x | x
 V | 14 | x | x | x

U_table:

class | id | p | q
------+----+---+---
 U | 12 | x | x

V_table:

class | id | s | t
------+----+---+---
 V | 14 | x | x

Here you can properly formulate the constraint, but you run into
trouble when you want to access objects of the types U and V
simultaneously in the same query, because you would need
subselects in a UNION statement to recreate the original table
(again, MySQL cannot even formulate this query).

Store large optional row components:

This is in fact the same problem as above, only reduced to A and
U. The A_table here hold the nonoptional base components and the
U_table holds all the additional components of the class U
members. Depending on the width of the U_table and quotient
|U|/|A| (number of U objects to number of A objecs) this can
save a lot or be completely useless, only adding complexity.

Split tables for performance:

Some implementations of RDBMs have limitations regarding the
total row width or the number of columns in a row so that you
have to split a class A into artificial subclasses A_1 and A_2,
which are interdependent and form a nonoptional 1:1
relationship.

Also, sometimes performance degrades if you do not do the split,
for example when working with BLOBs which are part of a heavily
used table or when working with BLOBs which are not the final
columns in a table definitions (This is the case in MySQL and in
many other RDBMs). Some RDBMs manage to create such a split for
BLOBs internally and completely hide this fact to the outside,
for performance reasons.

Implement key translations

This is again the "A is a superclass of U" case, sometimes with
all instances of A being members of class U.

Key translations happen where you connect two indepently grown
models to each other and both models need to relate their
objects to each other. For example, an ISP has customer accounts
with account numbers in his business systems, where offers and
invoices are written and where all bookkeeping is done. He also
as customer accounts with account identifiers in his production
systems where system configurations are generated, usage is
metered and aggregate metering records are created.

In a small shop, both systems may have grown independently and
were only manually connected. Key translations are done in the
heads of the people ("The login kriski reports a problem, can I
have the phone number of their technical contact?" "Ah, let me
see, kriski is customer number 3482. That would be 555-3432
then.") or on paper, but generally offline. When you
interconnect such systems, you cannot or do not want to migrate
all primary keys in one system to the other system. Instead
you'd implement a translation table where the keys of one system
are converted into the keys of the other system so that you
could select phone numbers via the translation table using an
account name.

The auth_user table is just that, a key translation table. The
Auth and User classes work with p_user_id varchar(32) fields,
which are the PRIMARY KEY of the auth_user table and are the
primary and unchanging user identifications in these subsystems.
You application will never work with Auth and User (these are
the "A" classes), but of Example_Auth and Example_User, which
are subclasses of Auth and User respectively (these are the "U"
classes).

You'd never present the p_user_id to an end user. Instead your
end users identify themselves using some other unique
identifier. In many cases this is a username, but some
applications may have other requirements, for example a
given_name and a family_name comobination or a mail address or a
customer number. Auth and User need to implement user
identities, but must not make any assumptions about the form of
the unique user identifiers, as they cannot know in advance what
the requirements of the application may be. So they create
artificial internal identifiers, the p_user_id.

The auth_user table is then being used to translate the PRIMARY
KEY of your applications, whatever form it may have, into the
PRIMARY KEY used by the Auth and User classes, which has the
form of a p_user_id.

[ End sermon on 1:1 relationships -- KK ]

As auth_user is a key translation table, the $auth->auth["uid"]
alias auth_user.p_user_id and $auth->auth["uname"] alias
auth_user.p_username can be used interchangeably. It is a design
decision what to use and there are reasons for either choice.

If you are building on the green and there are no legacy
systems, I suggest that you always use the p_user_id to
reference users internally in your application and that you
always translate these identifiers into p_usernames when you
present the user with user identifiers. That way you keep your
code independent from any particular representation of user
idenfiers and may change them or port the code to other
applications doing the same.

If you have legacy systems to which you have to talk, the
situation may be more difficult. In fact, it may become
arbitrarily complex as the legacy systems may have another,
third form of user identifier or the translation process may be
multistaged. Keep your system boundaries well defined and know
when your are crossing them and why. Don't let identifiers from
one system drift into the other system.

As for your appointment manager: I would create that system
using always and only the p_user_id internally and create a
special lookup function which converts the p_user_id into a
display name. That way your appointment manager does not depend
on any particular form or order of display names. It may use
usernames, "given family", "family given" or email addresses as
display names in different instances. Never store user names
anywhere outside the auth_user table, but always use the
p_user_id instead.

Kristian

auth_user

-
PHP3 Base Library Mailing List. Send messages to <phplib <email protected>>.
To unsubscribe, send "unsubscribe" to <phplib-request <email protected>> in
the body, not the subject, of your message.