PHP coding guidelines

Acknowledgments

These guidelines were not originally intended for public consumption. They were collected and compiled in the early 2000s from a number of sources on the web, and were intended simply for the use of the development team I led at the time. It was only years later, after the guidelines had been expanded and modified and I had left that project, that I thought to share them in the hope that others might find them useful.

Unfortunately, since I never bothered to keep track of where the bits and pieces of the original guidelines were taken from, it may appear that I am unjustly taking credit for the work and guidance of others. This is not my intention, and if you know that any portion of these guidelines was originally taken from elsewhere, I would be grateful if you would let me know so that I may thank the authors and give them proper credit.

My thanks and apologies are extended to the following sources, which inspired these guidelines -- in some cases, the original phrasing still remains here, virtually unchanged from the form in which it first appeared. No offense or infringement is intended.

Using these guidelines

If you want to make a local copy of these guidelines and use them as your own, you are perfectly free to do so. If you find any errors, please email me the changes so I can merge them in.

PHP 4 vs. PHP 5

These guidelines were written for PHP 4, and have not been updated for changes made in PHP 5. As such, if you are using PHP 5, exercise some care to make sure that these guidelines are relevant to your project. You should not adopt these without reading and understanding them (that's true even if you are still using PHP 4).

Introduction

This PHP coding guidelines document contains the standard conventions that the author follows and recommends that others follow. It covers filenames, file organization, indentation, comments, declarations, statements, white space, naming conventions, programming practices, and includes code examples for most of the guidelines presented. The majority of these guidelines are based on the Sun Java guidelines and the PEAR guidelines.

Why have code conventions?

Code conventions are important to programmers for a number of reasons:

It doesn't matter what your guidelines are, so long as everyone understands and sticks to them. These PHP coding guidelines are based largely on Sun's Code Conventions for the Java Programming Language. Deviations from the Sun code conventions are largely a result of the interaction of PHP with other web server applications, primarily databases.

Enforcing the code conventions

If you are the lead developer on a project, you will encounter programmers who do not like your guidelines and refuse to abide by them. Coping with this is one of the small obstacles that leading a development team inevitably entails. You have two options. You might decide that the value of the errant programmer's contributions outweighs the inconsistency introduced by their refusal to abide by your coding conventions. However, you are not doing yourself or your team any favors by permitting one programmer to be "above the rules": this will only lead to disruptions in the team down the road. The better option is two-fold. First, try to achieve consensus among the team members. Not everyone will agree on everything, but it's better to minimize disagreements when possible. Second, once a code convention has been established, do not accept code into the project unless it adheres to that convention. This will cause some initial friction, but the long-term cost of inconsistent coding conventions is much greater than the small amount of effort it will take new team members to adapt to those conventions. At the end of the day, professional programmers know that they need to be adaptable and to abide by the rules of the current project.


Editor settings

Indentation (tabs vs. spaces)

Justification

Example

function func()
{
	if (something bad)
	{
		if (another thing bad)
		{
			while (more input)
			{
				...
			}
		}
	}
}

Line endings

The three major operating systems (Unix, Windows, and Mac OS) have historically used different ways to represent the end of a line. Unix systems use a line feed (\n), classic Mac OS (before OS X) used a carriage return (\r), and Windows systems are terribly wasteful in that they use a carriage return followed by a line feed (\r\n). Ensure that your editor is saving files in the Unix format, with every line terminated by a single line feed (\n). Any decent editor (such as Notepad++) is able to do this, but it might not be the default. If you develop on Windows (and many people do), either set up your editor to save files in Unix format or run a utility that converts between the two file formats.
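If you would rather convert files with a small script than with a separate utility, the conversion can be sketched in PHP itself. This is only an illustration: the function name normalizeLineEndings() and the sample text are made up for this example.

```php
<?php
// Hypothetical helper: convert Windows (\r\n) and classic Mac (\r)
// line endings to the Unix form (\n).
function normalizeLineEndings($text)
{
	// \r\n is replaced first, so that a Windows line ending does not
	// turn into two line feeds when the lone \r is replaced.
	return str_replace(array("\r\n", "\r"), "\n", $text);
}

$mixed_text = "first\r\nsecond\rthird\n";
$unix_text = normalizeLineEndings($mixed_text);
```

str_replace() works through the search array in order, which is why the two-character sequence is listed before the single carriage return.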

Terminate the last line of the file, unless it's the only line.


Names

Make names fit

Names are the heart of programming. In the past people believed knowing someone's true name gave them magical power over that person. If you can think up the true name for something, you give yourself and the people coming after power over the code.

A name is the result of a long deep thought process about the ecology it lives in. Only a programmer who understands the system as a whole can create a name that "fits" with the system. If the name is appropriate everything fits together naturally, relationships are clear, meaning is derivable, and reasoning from common human expectations works as expected.

If you find all your names could be Thing and DoIt then you should probably revisit your design.

Hungarian notation

Hungarian notation is the practice of embedding metadata about a variable into the variable's name. For example, a variable holding a long integer might be prefixed with "l" (lower case L), while an unsigned 32-bit integer might be prefixed with "u32". There are several problems with using this type of notation. First, PHP is a dynamically typed language: a variable can hold a value of any type at any time, so a type prefix is not reliable. Second, Hungarian notation can quickly render a variable name into an unrecognizable mess (Wikipedia uses "a_crszkvc30LastNameCol" as an example: a constant reference argument, holding the contents of a database column LastName of type varchar(30) which is part of the table's primary key).

Having variables which are not human-readable will lead to errors when the code is revised or maintained. Do not use Hungarian notation.

Justification

Abbreviations

Use whole words -- avoid acronyms and abbreviations unless the acronym is much more widely used than the long form, such as URL and HTML. When you could use an all upper case abbreviation, instead use an initial upper case letter followed by all lower case letters.

Justification

Example

class FluidOz // NOT FluidOZ
class GetHtmlStatistic // NOT GetHTMLStatistic

Class names

Class names should be nouns, in mixed case with the first letter of each internal word capitalized. Try to keep your class names simple and descriptive.

Example

class NameOneTwo
class Name

Class library names

Now that namespaces are becoming more widely implemented, they should be used to prevent class name conflicts among libraries from different vendors and groups. When not using namespaces, it's common to prevent class name clashes by prefixing class names with a unique string. Two characters is often sufficient, but a longer length is fine. For example, the xTS project used "Xts" (don't let the shift in capitalization bother you: see Class names).

Example

Jo Johanssen's data structure library could use JJ as a prefix, so classes could be:

class JjNameOneTwo
{
}

Method and function names

Methods should be verbs, in mixed case with the first letter lowercase and the first letter of each internal word capitalized. Most methods and functions perform actions, so the name should make clear what it does, as concisely as possible: checkForErrors() instead of errorCheck(), dumpDataToFile() instead of dumpDataFileToDiskAfterAParticularlyHorrendousCrash(). This will also make functions and data objects more distinguishable.

Example

class JjNameOneTwo
{
	function setSomething()
	{
		...
	}
	function handleError()
	{
		...
	}
}

Variable names

Variable names should be all lowercase, with words separated by underscores. For example, $current_user is correct, but $currentuser, $currentUser and $CurrentUser are not. Variable names should be short yet meaningful. The choice of a variable name should indicate to the casual observer the intent of its use. One-character variable names should be avoided except for temporary variables and loop indices. Common names for temporary variables are i, j, k, m, and n for integers; c, d, e for strings.

Justification

Example

function handleError($error_number)
{
	$error = new OsError;
	$time_of_error = $error->getTimeOfError();
	$error_processor = $error->getErrorProcessor($error_number);
}

Example

$myarr['foo_bar'] = 'Hello';
print "$myarr[foo_bar] world"; // will output: Hello world
$myarr['foo-bar'] = 'Hello';
print "$myarr[foo-bar] world"; // warning message

Method argument names

Since function arguments are just variables used in a specific context, they should follow the same guidelines as variable names. It should be possible to tell the purpose of a method just by looking at the first line, e.g. getUserData($username). By examination, you can make a good guess that this function gets the user data of the user whose username is passed in the $username argument. Method arguments should be separated by a comma followed by a single space, both when the function is defined and when it is called. However, there should not be any spaces between the arguments and the opening/closing parentheses.

Example

class NameOneTwo
{
	function startYourEngines(&$some_engine, &$another_engine)
	{
		$this->some_engine = $some_engine;
		$this->another_engine = $another_engine;
	}
	var $some_engine;
	var $another_engine;
}

Example

get_user_data( $username, $password ); // NOT correct: spaces next to parentheses
get_user_data($username,$password); // NOT correct: no spaces between arguments
get_user_data($a, $b); // ambiguous: what do variables $a and $b hold?
get_user_data($username, $password); // correct

Array elements

Since array elements are just variables used in a specific context, they should follow the same guidelines as variable names.

Justification

Example

$myarr['foo_bar'] = 'Hello';
$element_name = 'foo_bar';
print "$myarr[foo_bar] world"; // will output: Hello world
print "$myarr[$element_name] world"; // will output: Hello world
print "$myarr['$element_name'] world"; // parse error
print "$myarr["$element_name"] world"; // parse error

Constant (define) names

Constants should be all uppercase with words separated by underscores ('_').

Justification

It's tradition for global constants to be named this way. You must be careful not to conflict with other predefined globals.

This capitalization method for constant values (particularly for language-specific constants) provides the greatest amount of flexibility.

Example

define("A_CONSTANT", "Hello world!");
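One way to guard against the name conflicts mentioned above is to ask defined() whether a name is already taken before calling define(). The following is a minimal sketch that reuses A_CONSTANT from the example; the variable names are made up for illustration.

```php
<?php
// Define the constant only if the name is not already in use.
if (!defined('A_CONSTANT'))
{
	define('A_CONSTANT', 'Hello world!');
}

// E_ALL is predefined by PHP itself, so this name is already taken.
$name_is_taken = defined('E_ALL');

$greeting = A_CONSTANT;
```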

Constant values

Example

define("TITLE_CONSTANT", "User profile administration");
define("ERROR_CONSTANT", "This is an error message.");

Formatting

PHP code tags

PHP tags are used to delimit PHP from HTML in a file. There are several ways to do this: <?php ... ?>, <? ... ?>, <script language="php"> ... </script>, <% ... %>, and <?=$name?>. Some of these may be turned off in your PHP settings. Always use <?php ... ?>.

Justification

Example

<?php print "Hello world"; ?> // correct
<? print "Hello world"; ?> // NOT correct
<script language="php"> print "Hello world"; </script> // NOT correct
<% print "Hello world"; %> // NOT correct
<?=$street?> // NOT correct

One statement per line

There should be only one statement per line.

Braces {...}

This is a subject of great controversy, but we will use a policy that can be summed up simply: a brace appears alone on a line, left-aligned with its associated keyword. A closing brace is always in the same column as the corresponding opening brace. Additionally, if braces can be used, they must be used. Leaving out braces makes code harder to maintain in the future, and can also cause bugs that are very difficult to track down.

Example

if ($condition)
{
	while ($condition)
	{
		...
	}
}

// NOT...

if ($condition)
	while ($condition) { // NOT correct
		...
	}

Justification

Example

if ($very_long_condition && $second_very_long_condition)
{
	...
}
elseif (...)
{
	...
}

There are those who prefer to leave curly braces off of very simple if statements, and who are satisfied to convey the conditional nature of the statement through indentation. The problem with this is twofold: first, indentation can't be trusted to remain in place. Code changes. A few lines of code may be copied from one place and pasted into another, and re-indented to fit with its new home. Or perhaps it might be added to a nested control structure, or an additional control structure might be added after the current un-braced if. All of this is simply an error waiting to happen.

Unless you are the only person who works on your code, and you never touch it a second time after you type it, leaving the curly braces out will result in errors.
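The kind of error described above can be sketched in a few lines. The variable names and action strings are invented for this illustration; the point is only that indentation does not decide which statements belong to the if.

```php
<?php
$is_admin = FALSE;

// Un-braced: only the first statement belongs to the if, whatever
// the indentation suggests.
$actions = array();
if ($is_admin)
	$actions[] = 'log the request';
	$actions[] = 'delete the records'; // runs even though $is_admin is FALSE

// Braced: both statements are unmistakably inside the if.
$safe_actions = array();
if ($is_admin)
{
	$safe_actions[] = 'log the request';
	$safe_actions[] = 'delete the records';
}
```

After this runs, $actions holds one entry while $safe_actions is empty, even though the two blocks look alike at a glance.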

Parentheses (...)

As a general rule, parentheses should always be used where the possibility of ambiguity exists. Additionally, there should not be any added space between a parenthesis and its contents. Control statements, such as if, for, and while, should have one space after the keyword prior to the opening parenthesis. Methods should follow the rules laid out already, i.e. no space between the method name and the opening parenthesis, and no space between the parentheses and the arguments, but one space between each argument; generally speaking, this is the only time an opening parenthesis will not be preceded by a space.

Justification

Example

if ((condition1 || condition2) || (value1 < value2))
{
	...
}
while (condition)
{
	...
}
strcmp($s, $s1);
return 1;

Alignment of declaration blocks

A block of declarations may be aligned, but it does not have to be.

Justification

Do not waste a lot of time aligning blocks of variable assignments: it can easily become a never-ending task.

Example

var	$date;
var	$name;
$date	= 0;
$name	= 0;

Operators and tokens

There are three types of operators. First, there are unary operators, which operate on only one value; for example '!' (the negation operator) or '++' (the increment operator). The second group are termed binary operators; this group contains most of the operators that PHP supports, such as '+', '-', and so on. A list of binary operators follows below in the section on operator precedence. The third group is the ternary operator: '... ? ... : ...', which is an abbreviated form of if... then... else. Surrounding the operands in ternary expressions with parentheses is a very good idea.

There should not be a space between a unary operator and the operand on which it operates.

There should always be one space on either side of a token or binary operator. The only exceptions are commas and semicolons, which should have one space after, but none before (just like in English).
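Taken together, the spacing rules above might look like this in practice. The variables and values are arbitrary and exist only to give each kind of operator something to work on.

```php
<?php
$count = 3;

// Unary operators: no space between the operator and its operand.
$count++;
$is_empty = !$count;

// Binary operators: one space on either side.
$total = ($count * 2) + 1;

// Ternary operator: operands in parentheses where it helps readability.
$label = ($total > 5) ? 'many' : 'few';

// Commas: one space after, none before.
$parts = array($count, $total, $label);
```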

if ... then ... else

Always use braces.

Example

if (condition)
{
	// comment
	...
}
else if (condition)
{
	// comment
	...
}
else
{
	// comment
	...
}

If you have else if statements then it is usually a good idea to always have an else block for finding unhandled cases. Maybe put a log message in the else even if there is no corrective action taken.
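As a sketch of that advice, the function below logs any value it was not written to handle. The function name, the status values, and the choice of error_log() as the logging mechanism are all assumptions made for this example.

```php
<?php
// Hypothetical function; the status names are made up for illustration.
function describeOrderStatus($status)
{
	if ('new' == $status)
	{
		$description = 'Order received';
	}
	else if ('shipped' == $status)
	{
		$description = 'Order on its way';
	}
	else
	{
		// Unhandled case: no corrective action is taken, but a log
		// entry records that something unexpected arrived.
		error_log("describeOrderStatus: unexpected status '$status'");
		$description = 'Unknown status';
	}
	return $description;
}
```

Calling describeOrderStatus('lost') still returns a usable string, and the log entry tells a maintainer that a new status has appeared.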

switch

Example

switch (...)
{
	case 1:
		...
		// FALL THROUGH
	case 2:
		$v = get_week_number();
		...
		break;
	default:
}

continue, break, and ... ? ... : ...

continue and break

Continue and break are really disguised gotos so they are covered here.

Continue and break, like goto, should be used sparingly, as they are magic in code. With a simple spell the reader is beamed to gods know where for some usually undocumented reason.

The two main problems with continue are:

Consider the following example where both problems occur:

while (TRUE)
{
	...
	// A lot of code
	...
	if (/* some condition */)
	{
		continue;
	}
	...
	// A lot of code
	...
	if ( $i++ > STOP_VALUE) break; // WRONG: braces are not optional
}

Note: "A lot of code" is necessary in order that the problem cannot be caught easily by the programmer.

From the above example, a further rule may be given: mixing continue with break in the same loop is a sure way to disaster.

This is also a good example of why braces are not optional.

... ? ... : ...

The trouble with ternary operators is that people usually try and stuff too much code into them. It is better to avoid ternary operators unless all of the operands are very short and very simple. When in doubt, use if... then... else.
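A rough illustration of where that line might be drawn follows. The cart wording and variable names are invented for this example.

```php
<?php
$item_count = 1;

// Short, simple operands: a ternary reads well.
$noun = (1 == $item_count) ? 'item' : 'items';

// Anything more involved: use if ... else rather than nesting ternaries.
if (0 == $item_count)
{
	$message = 'Your cart is empty';
}
else
{
	$message = "Your cart holds $item_count $noun";
}
```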


Documentation

Comments on comments

Comments should tell a story

Consider your comments a story describing the system. Expect your comments to be extracted by a robot and formed into a man page. Class comments are one part of the story, method signature comments are another part of the story, method arguments another part, and method implementation yet another part. All these parts should weave together and inform someone else at another point of time just exactly what you did and why.

Document decisions

Comments should document decisions. At every point where you had a choice of what to do, place a comment describing which choice you made and why. Archeologists will find this the most useful information.

Use headers

Use a document extraction system like phpDocumentor.
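A header that phpDocumentor can extract is an ordinary comment block opened with /** and placed directly above the element it describes. The sketch below uses the @param and @return tags; the function itself is hypothetical and exists only to carry the comment.

```php
<?php
/**
 * Returns the area of a rectangle.
 *
 * Hypothetical function, used only to show the header layout.
 *
 * @param float $width  Width of the rectangle.
 * @param float $height Height of the rectangle.
 * @return float The area, in the square of the input unit.
 */
function getRectangleArea($width, $height)
{
	return $width * $height;
}
```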

Comment layout

Each part of the project has a specific comment layout.

Make gotchas explicit

Explicitly comment variables changed out of the normal control flow or other code likely to break during maintenance. Embedded keywords are used to point out issues and potential problems. Consider that a robot will parse your comments looking for keywords, stripping them out, and making a report so people can make a special effort where needed.

Gotcha keywords

Gotcha formatting

Example

// :TODO: tmh 1996-08-10: possible performance problem
// We should really use a hash table here but for now we'll
// use a linear search.
// :KLUDGE: tmh 1996-08-10: possible unsafe type cast
// We need a cast here to recover the derived type. It should
// probably use a virtual method or template.

See also

See Interface and Implementation Documentation for more details on how documentation should be laid out.

Interface and implementation documentation

There are two main audiences for documentation:

With a little forethought we can extract both types of documentation directly from source code.

Class users

Class users need class interface information which when structured correctly can be extracted directly from a header file. When filling out the header comment blocks for a class, only include information needed by programmers who use the class. Don't delve into algorithm implementation details unless the details are needed by a user of the class. Consider comments in a header file a man page in waiting.

Class implementors

Class implementors require in-depth knowledge of how a class is implemented. This comment type is found in the source file(s) implementing a class. Don't worry about interface issues. Header comment blocks in a source file should cover algorithm issues and other design decisions. Comment blocks within a method's implementation should explain even more.

Copyright notice format

"Copyright ©" is redundant. The circle-c symbol is intended to be an optional replacement for the word "copyright", and "(c)", i.e. c-in-parentheses, has never been given legal force.

The proper format in ASCII is "Copyright <year(s)> <copyright owner>", e.g. "Copyright 2002 Merciless Hangover, LLC".


Server configuration

This section contains some guidelines for PHP/Apache configuration.

HTTP_*_VARS

HTTP_*_VARS are either enabled or disabled. When enabled all variables must be accessed through $HTTP_*_VARS[key]. When disabled all variables can be accessed by the key name.

Justification

PHP file extensions

There are lots of different extension variants on PHP files (.html, .php, .php3, .php4, .phtml, .inc, .class, etc.).

Justification

Example

filename.inc // NOT correct
filename.class // NOT correct
filename.php // correct
filename.inc.php // correct
filename.class.php // correct

Miscellaneous

This section contains some miscellaneous do's and don'ts.

Use if (0) to comment out very large code blocks

Sometimes very large blocks of code need to be commented out for testing. The easiest way to do this is with an if (0) block:

function example()
{
	great looking code
	if (0)
	{
		lots of code
	}
	more code
}

You can't use /**/ style comments because comments can't contain comments, and surely a large block of your code will contain a comment, won't it?


Classes

Short methods

Methods should limit themselves to a single page of code.

Justification

Do not do real work in object constructors

Do not do any real work in an object's constructor. Inside a constructor initialize variables only and/or do only actions that can't fail.

Create an Open() method for an object which completes construction. Open() should be called after object instantiation.

Justification

Example

class Device
{
	function Device()
	{
		/* initialize and other stuff */
	}
	function Open()
	{
		return FAIL;
	}
}

$dev = new Device;

if (FAIL == $dev->Open())
{
	exit(1);
}
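
The example above is written in the PHP 4 style and leaves FAIL undefined. A sketch of the same idea using the __construct() form introduced in PHP 5 might look like the following. The FAIL and OK values, the device path, and the use of fopen() as the "real work" are all assumptions made for this illustration.

```php
<?php
// Assumed sentinel values; the original example does not define them.
define('FAIL', 0);
define('OK', 1);

class Device
{
	public $path;
	public $handle;

	// The constructor only stores values, so it cannot fail.
	function __construct($path)
	{
		$this->path = $path;
		$this->handle = NULL;
	}

	// The real work happens here, and failure is reported to the caller.
	function open()
	{
		// @ suppresses the warning; the return value is checked instead.
		$handle = @fopen($this->path, 'r');
		if (FALSE === $handle)
		{
			return FAIL;
		}
		$this->handle = $handle;
		return OK;
	}
}

$dev = new Device('/no/such/device');
$result = $dev->open();
```

Because construction cannot fail, the caller always receives a valid object and has exactly one place, the return value of open(), to check for errors.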

Make functions reentrant

Functions should not keep static variables that prevent a function from being reentrant.
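
The problem can be sketched with a recursive function that collects its result in a static variable. Both function names below are hypothetical, and flattening a nested array is just a convenient task for showing state that leaks from one call into the next.

```php
<?php
// Not reentrant: the static array is shared by every call, so results
// from an earlier call leak into a later one.
function flattenStatic($items)
{
	static $result = array();
	foreach ($items as $item)
	{
		if (is_array($item))
		{
			flattenStatic($item);
		}
		else
		{
			$result[] = $item;
		}
	}
	return $result;
}

// Reentrant: every call builds and returns its own result.
function flatten($items)
{
	$result = array();
	foreach ($items as $item)
	{
		if (is_array($item))
		{
			$result = array_merge($result, flatten($item));
		}
		else
		{
			$result[] = $item;
		}
	}
	return $result;
}

$first_static = flattenStatic(array(1, array(2)));
$second_static = flattenStatic(array(3)); // still contains 1 and 2
$second_clean = flatten(array(3));
```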

Error return check policy

Document null statements

Always document a null body for a for or while statement so that it is clear that the null body is intentional and not missing code.

for ($i = 0; ($i < $length) && (' ' == $text[$i]); $i++)
{
	; // VOID
}

Do not default if test to non-zero

Do not default the test for non-zero, i.e.

if (FAIL != f())

is better than

if (f())

even though FAIL may have the value 0 which PHP considers to be false. An explicit test will help you out later when somebody decides that a failure return should be -1 instead of 0. Explicit comparison should be used even if the comparison value will never change; e.g., if (!($bufsize % strlen($str))) should be written instead as if (0 == ($bufsize % strlen($str))) to reflect the numeric (not boolean) nature of the test. A frequent trouble spot is using strcmp to test for string equality, where the result should never ever be defaulted.
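
The strcmp trouble spot can be shown in a few lines. strcmp() returns 0 when the two strings are equal, so a defaulted test is true precisely when they differ. The variable names and values below are invented for this sketch.

```php
<?php
$expected = 'secret';
$supplied = 'secret';

// Defaulted test: reads like "if the strings compare", but strcmp()
// returns 0 for equal strings, so this is FALSE when they match.
$defaulted_test = strcmp($expected, $supplied) ? TRUE : FALSE;

// Explicit test: states what is actually being asked.
$strings_match = (0 == strcmp($expected, $supplied));
```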

The non-zero test is often defaulted for predicates and other functions or expressions which meet the following restrictions:

Boolean types

Do not check a boolean value for equality with 1 (TRUE, YES, etc.); instead test for inequality with 0 (FALSE, NO, etc.). Most functions are guaranteed to return 0 if false, but only non-zero if true. Thus,

if (TRUE == func())
{
...

must be written

if (FALSE != func())
{
...

Usually avoid embedded assignments

There is a time and a place for embedded assignment statements. In some constructs there is no better way to accomplish the results without making the code bulkier and less readable.

while ($a != ($c = getchar()))
{
	process the character
}

The ++ and -- operators count as assignment statements. So, for many purposes, do functions with side effects. Using embedded assignment statements to improve run-time performance is also possible. However, one should consider the tradeoff between increased speed and decreased maintainability that results when embedded assignments are used in artificial places. For example,

$a = $b + $c;
$d = $a + $r;

should not be replaced by

$d = ($a = $b + $c) + $r;

even though the latter may save one cycle. In the long run the time difference between the two will decrease as the optimizer gains maturity, while the difference in ease of maintenance will increase as the human memory of what's going on in the latter piece of code begins to fade.
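
For contrast, here is a sketch of the legitimate kind of embedded assignment described at the start of this section, written with PHP's own fgetc() in place of getchar(). The in-memory stream and the sample text are assumptions made so that the example is self-contained.

```php
<?php
// An in-memory stream stands in for a real file or socket.
$handle = fopen('php://memory', 'r+');
fwrite($handle, 'abc');
rewind($handle);

$characters = array();

// fgetc() returns FALSE at end of input, so the read and the test
// fit naturally into the loop condition.
while (FALSE !== ($character = fgetc($handle)))
{
	$characters[] = $character;
}

fclose($handle);
```

Writing this loop without the embedded assignment would require reading once before the loop and again at the bottom of the body, which is the bulkier alternative the text has in mind.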


Complexity management

Layering

Layering is the primary technique for reducing complexity in a system. A system should be divided into layers. Layers should communicate between adjacent layers using well defined interfaces. When a layer uses a non-adjacent layer then a layering violation has occurred.

A layering violation simply means we have dependency between layers that is not controlled by a well defined interface. When one of the layers changes, code could break. We don't want code to break, so we want layers to work only with other adjacent layers.

Sometimes we need to jump layers for performance reasons. This is fine, but we should know we are doing it and document it appropriately.
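
A very small sketch of three layers, each talking only to the one directly beneath it, might look like this. All of the class, method, and function names are hypothetical, and the syntax assumes PHP 5 or later.

```php
<?php
// Bottom layer: storage.
class UserStore
{
	public $rows = array(1 => 'alice', 2 => 'bob');

	function findName($user_id)
	{
		return isset($this->rows[$user_id]) ? $this->rows[$user_id] : NULL;
	}
}

// Middle layer: business rules. Talks only to the storage layer.
class UserService
{
	public $store;

	function __construct($store)
	{
		$this->store = $store;
	}

	function getDisplayName($user_id)
	{
		$name = $this->store->findName($user_id);
		return (NULL === $name) ? 'Guest' : ucfirst($name);
	}
}

// Top layer: presentation. Talks only to the service layer and never
// reaches past it into UserStore.
function renderGreeting($service, $user_id)
{
	return 'Hello, ' . $service->getDisplayName($user_id) . '!';
}

$service = new UserService(new UserStore());
$greeting = renderGreeting($service, 1);
```

If renderGreeting() read $service->store->rows directly, that would be the layering violation described above: a change to how UserStore keeps its data would then break presentation code.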

Open / Closed principle

The Open/Closed principle states a class must be open and closed where:

The Open/Closed principle is a pitch for stability. A system is extended by adding new code not by changing already working code. Programmers often don't feel comfortable changing old code because it works! This principle just gives you an academic sounding justification for your fears :-)

In practice the Open/Closed principle simply means making good use of our old friends abstraction and polymorphism. Abstraction to factor out common processes and ideas. Inheritance to create an interface that must be adhered to by derived classes.
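
As a sketch of what this can look like in code, the example below extends a small system by adding a class rather than by editing working code. The shape classes and the sumAreas() function are hypothetical, and the abstract class syntax assumes PHP 5 or later.

```php
<?php
// The abstraction: every shape must be able to report its area.
abstract class Shape
{
	abstract function getArea();
}

class Rectangle extends Shape
{
	public $width;
	public $height;

	function __construct($width, $height)
	{
		$this->width = $width;
		$this->height = $height;
	}

	function getArea()
	{
		return $this->width * $this->height;
	}
}

// Already working code: it does not change when a new shape appears.
function sumAreas($shapes)
{
	$total = 0;
	foreach ($shapes as $shape)
	{
		$total += $shape->getArea();
	}
	return $total;
}

// The extension: new code only, nothing above is edited.
class Square extends Shape
{
	public $side;

	function __construct($side)
	{
		$this->side = $side;
	}

	function getArea()
	{
		return $this->side * $this->side;
	}
}

$total_area = sumAreas(array(new Rectangle(2, 3), new Square(2)));
```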


Development process

Code reviews

Code reviews can be very useful. Unfortunately, they often degrade into nit picking sessions and endless arguments about silly things. They also tend to take a lot of people's time for a questionable payback.

First, code reviews are way too late to do much of anything useful. What needs reviewing are requirements and design. This is where you will get more bang for the buck.

Get all of the relevant people in a room. Lock them in. Go over the class design and requirements until the former is good and the latter is being met. Having all the relevant people in the room makes this process a deep fruitful one as questions can be immediately answered and issues immediately explored. Usually only a couple of such meetings are necessary.

If the above process is done well coding will take care of itself. If you find problems in the code review the best you can usually do is a rewrite after someone has sunk a ton of time and effort into making the code "work."

You will still want to do a code review, just do it offline. Have a couple people you trust read the code in question and simply make comments to the programmer. Then the programmer and reviewers can discuss issues and work them out. Email and quick pointed discussions work well. This approach meets the goals and doesn't take the time of 6 people to do it.

Create a source code control system early and not often

A common build system and source code control system should be put in place as early as possible in a project's lifecycle, preferably before anyone starts coding. Source code control is the structural glue binding a project together. If programmers can't easily use each other's products then you'll never be able to make a good reproducible build and people will piss away a lot of time. It's also hell converting rogue build environments to a standard system. But it seems the rite of passage for every project to build their own custom environment that never quite works right.

Some issues to keep in mind:

Sources

I recommend Subversion for a version control system.

Create a bug tracking system early and not often

The earlier people get used to using a bug tracking system the better. If you are 3/4 through a project and then install a bug tracking system it won't be used. You need to install a bug tracking system early so people will use it.

Programmers generally resist bug tracking, yet when used correctly it can really help a project:

Not sexy things, just good solid project improvements.

Source code control should be linked to the bug tracking system. During the part of a project where source is frozen before a release only checkins accompanied by a valid bug ID should be accepted. And when code is changed to fix a bug the bug ID should be included in the checkin comments.

Resources

I recommend Mantis for bug tracking. Mantis integrates with Subversion.

Honor responsibilities

Responsibility for software modules is scoped. Modules are either the responsibility of a particular person or are common. Honor this division of responsibility. Don't go changing things that aren't your responsibility to change. Only mistakes and hard feelings will result.

Face it, if you don't own a piece of code you can't possibly be in a position to change it. There's too much context. Assumptions seemingly reasonable to you may be totally wrong. If you need a change simply ask the responsible person to change it. Or ask them if it is OK to make such-n-such a change. If they say OK then go ahead, otherwise holster your editor.

Every rule has exceptions. If it's 3 in the morning and you need to make a change to make a deliverable then you have to do it. If someone is on vacation and no one has been assigned their module then you have to do it. If you make changes in other people's code try and use the same style they have adopted.

Programmers need to mark with comments code that is particularly sensitive to change. If code in one area requires changes to code in another area then say so. If changing data formats will cause conflicts with persistent stores or remote message sending then say so. If you are trying to minimize memory usage or achieve some other end then say so. Not everyone is as brilliant as you.

The worst sin is to flit through the system changing bits of code to match your coding style. If someone isn't coding to the standards then ask them or ask your manager to ask them to code to the standards. Use common courtesy.

Code with common responsibility should be treated with care. Resist making radical changes as the conflicts will be hard to resolve. Put comments in the file on how the file should be extended so everyone will follow the same rules. Try and use a common structure in all common files so people don't have to guess on where to find things and how to make changes. Check in changes as soon as possible so conflicts don't build up.

As an aside, module responsibilities must also be assigned for bug tracking purposes.

No magic numbers

A magic number is a bare-naked number used in source code. It's magic because no one has a clue what it means, including the author within 3 months. For example:

if (22 == $foo)
{
	start_thermo_nuclear_war();
}
else if (19 == $foo)
{
	refund_lotso_money();
}
else if (16 == $foo)
{
	infinite_loop();
}
else
{
	cry_cause_im_lost();
}

In the above example what do 22 and 19 mean? If there was a number change or the numbers were just plain wrong how would you know?

Heavy use of magic numbers marks a programmer as an amateur more than anything else. Such a programmer has never worked in a team environment or had to maintain code, or they would never do such a thing.

Instead of magic numbers use a real name that means something. You should use define(). For example:

define("PRESIDENT_WENT_CRAZY", 22);
define("WE_GOOFED", 19);
define("THEY_DIDNT_PAY", 16);

if (PRESIDENT_WENT_CRAZY == $foo)
{
	start_thermo_nuclear_war();
}
else if (WE_GOOFED == $foo)
{
	refund_lotso_money();
}
else if (THEY_DIDNT_PAY == $foo)
{
	infinite_loop();
}
else
{
	happy_days_i_know_why_im_here();
}

Now isn't that better?

Thin vs. fat class interfaces

How many methods should an object have? The right answer, of course, is just the right amount; we'll call this the Goldilocks level. But what is the Goldilocks level? It doesn't exist. You need to make the right judgment for your situation, which is really what programmers are for :-)

The two extremes are thin classes versus thick classes. Thin classes are minimalist classes. Thin classes have as few methods as possible. The expectation is users will derive their own class from the thin class adding any needed methods.

While thin classes may seem "clean" they really aren't. You can't do much with a thin class. Its main purpose is setting up a type. Since thin classes have so little functionality many programmers in a project will create derived classes with everyone adding basically the same methods. This leads to code duplication and maintenance problems which is part of the reason we use objects in the first place. The obvious solution is to push methods up to the base class. Push enough methods up to the base class and you get thick classes.

Thick classes have a lot of methods. If you can think of it a thick class will have it. Why is this a problem? It may not be. If the methods are directly related to the class then there's no real problem with the class containing them. The problem is people get lazy and start adding methods to a class that are related to the class in some willow wispy way, but would be better factored out into another class. Judgment comes into play again.

Thick classes have other problems. As classes get larger they may become harder to understand. They also become harder to debug as interactions become less predictable. And when a method is changed that you don't use or care about your code will still have to be retested, and rereleased.

Reusing your hard work and the hard work of others

Reuse across projects is almost impossible without a common framework in place. Objects conform to the services available to them. Different projects have different service environments making object reuse difficult.

Developing a common framework takes a lot of up front design effort. When this effort is not made, for whatever reasons, there are several techniques one can use to encourage reuse:

Don't be afraid of small libraries

One common enemy of reuse is people not making libraries out of their code. A reusable class may be hiding in a program directory and will never have the thrill of being shared because the programmer won't factor the class or classes into a library.

One reason for this is that people don't like making small libraries. There's something about small libraries that doesn't feel right. Get over it. The computer doesn't care how many libraries you have.

If you have code that can be reused and can't be placed in an existing library then make a new library. Libraries don't stay small for long if people are really thinking about reuse.

If you are afraid of having to update makefiles when libraries are recomposed or added then don't include libraries in your makefiles, include the idea of services. Base level makefiles define services that are each composed of a set of libraries. Higher level makefiles specify the services they want. When the libraries for a service change only the lower level makefiles will have to change.

Keep a repository

Most companies have no idea what code they have. And most programmers still don't communicate what they have done or ask for what currently exists. The solution is to keep a repository of what's available.

In an ideal world a programmer could go to a web page, browse or search a list of packaged libraries, taking what they need. If you can set up such a system where programmers voluntarily maintain such a system, great. If you have a librarian in charge of detecting reusability, even better.

Another approach is to automatically generate a repository from the source code. This is done by using common class, method, library, and subsystem headers that can double as man pages and repository entries.


Recent changes


Brandon Blackmoor
bblackmoor@blackgate.net
2010-05-30