You are currently browsing the monthly archive for October 2006.
Evans talks about a “Deep Model” when he discusses refactoring, and states:
A deep model provides a lucid expression of the primary concerns of the domain experts and their most relevant knowledge while it sloughs off the superficial aspects of the domain.
Most modeling at least starts out as a “find the nouns and verbs” game, but the key is that it shouldn’t stop there. I think this overlooked key is the primary reason why refactoring to a deep model is difficult. A developer has to listen very carefully to domain experts in order to identify the subtle behaviors that the expert may take for granted. Most applications are not physical simulations of their constituent nouns, so it makes perfect sense that the best model will not simply be concrete nouns and verbs, but representations of underlying relationships and behaviors.
Particularly with OO programming, many developers have a habit of viewing physical objects as model objects, overlooking the possibilities of behavioral objects. Evans includes Constraints, Processes, and Specifications (predicates) as good examples of explicit behavior. Essentially, abstracting procedural code into behavioral models does two things:
- It provides flexibility in replacing and augmenting behaviors, which will in turn provide flexibility for domain growth.
- It raises the importance of the behavior by naming it and giving it a place in architectural documentation or diagrams, where previously it would only be a few sentences describing an otherwise-anonymous process nested within API documentation.
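To make the “behavioral object” idea concrete, here is a minimal sketch of a Specification as a named, composable predicate. It is written in Python purely for illustration; the Invoice class, its fields, and the thresholds are all hypothetical and not taken from Evans.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Invoice:
    # Hypothetical domain object, invented for this sketch.
    amount: float
    days_overdue: int


class Specification:
    """A named predicate over invoices that can be combined with others."""

    def __init__(self, name: str, predicate: Callable[[Invoice], bool]):
        self.name = name
        self._predicate = predicate

    def is_satisfied_by(self, candidate: Invoice) -> bool:
        return self._predicate(candidate)

    def __and__(self, other: "Specification") -> "Specification":
        # Combining two rules yields a new rule that still carries a name.
        return Specification(
            f"({self.name} and {other.name})",
            lambda c: self.is_satisfied_by(c) and other.is_satisfied_by(c),
        )

    def __invert__(self) -> "Specification":
        return Specification(
            f"not {self.name}",
            lambda c: not self.is_satisfied_by(c),
        )


# The business rules now have names instead of living inside an if-statement.
delinquent = Specification("delinquent", lambda inv: inv.days_overdue > 30)
large = Specification("large", lambda inv: inv.amount >= 1000)
needs_collection = delinquent & large
```

Replacing or augmenting the rule means swapping one named object for another, and the name is something a domain expert can recognize and argue with.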
So why does this make refactoring hard? Because it’s design. Most refactoring discussions are exclusively code-level (and machine-assistable, as implied by Danny Dig and other refactoring researchers). The level of refactoring that Evans focuses on is not code “cleanliness” or any sort of mathematical graph-partitioning problem. It is the expressiveness of the model itself, and the process of converting a Nouns-n-Verbs Model into a Deep Model.
Software developers are rarely domain experts, so the biggest barrier is knowledge sharing and communication. Without domain experts pointing out the weakness, awkwardness, or inflexibility of a design, software developers are left to figure it out themselves, more by chance (a lucky modelling guess) or coincidence (mechanical refactoring clarifies the model as a side-effect) than by actual knowledge (domain research).
You know what really grinds my gears? Reading “PHP Security” articles on the internet and discovering nothing but crap in them. I keep seeing people attempting to work around register_globals, or using regular expressions to attempt to filter data that they should merely be escaping.
The only good source I’ve seen is Chris Shiflett, and this is probably going to be the same as what he says:
- Disable all magic_quotes (or include a stripslashes() in your input filtering process)
- Filter all input (regexes are ok for this) to bring it into a “pure” state
- Escape all output that can be traced back to user input, so malicious input cannot alter the syntactical structure of the output
- SQL queries are output
- HTML is output
- Shell execution is output
- Use SafeHTML if you really need to allow HTML syntax… but realize that you’re still not guaranteed to be safe and design your application to be minimally dangerous when compromised
(E_STRICT too, if you’re using PHP5 exclusively)
- Log errors internally instead of printing them for production servers
- Define all variables before use
- Don’t store sensitive information or logic client-side
- Cookies can be hijacked
- Be aware of character encodings
- htmlentities() et al. are encoding-aware and can become XSS vulnerabilities
- Most database software is encoding-aware, and addslashes() can become a SQL injection vulnerability
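The “escape all output” rule from the list above can be sketched as follows. The list is about PHP, but this sketch uses Python’s standard library instead, so treat it as an illustration of the idea rather than PHP advice; the table name and the sample input are made up. Each kind of output gets the escaping mechanism that belongs to it, and none of them is a regular expression.

```python
import html
import shlex
import sqlite3

# Hypothetical hostile input, invented for this sketch.
user_input = "Robert'); DROP TABLE students;-- <script>alert(1)</script>"

# HTML is output: escape markup characters so the input cannot add tags.
safe_html = "<p>Hello, {}</p>".format(html.escape(user_input, quote=True))

# SQL queries are output: pass the value as a bound parameter, so the
# driver keeps it separate from the syntax of the statement.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT)")
conn.execute("INSERT INTO students (name) VALUES (?)", (user_input,))
stored = conn.execute("SELECT name FROM students").fetchone()[0]

# Shell execution is output: quote the value so the shell sees one argument.
safe_command = "echo " + shlex.quote(user_input)
```

In all three cases the input is stored and displayed unchanged; it just can’t alter the syntactical structure of what surrounds it.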
The one thing I was looking for was session-ID protection, to minimize the damage possible if a malicious user does manage to grab cookies using an XSS exploit.
The bulk of the text in these chapters (especially 4 and 6) revolves around separating “business logic” from … everything else. Chapter 4 discusses the concept of a Layered Architecture and how it furthers DDD. I would consider this a rather basic natural progression for growing developers. Even with completely ad-hoc development, layers will naturally coalesce:
- Infrastructure: Most programs are based upon collections of libraries, because it reduces the effort required to get something done.
- User Interface: This gets a bit iffy at times, because the UI code is often abstracted into part of the Infrastructure (e.g. I am writing a Swing application), and really that’s only half of the battle. It’s less natural to separate GUI API-calling code from the application, but the use of a library at all is a nudge in the right direction.
The description of Factories and Repositories seems rather extreme, but it does follow in line with unit testing. Testing a domain object’s behavior should be as separated as possible from testing the object persistence, because the persistence is really only a means to an end. From personal experience, I also know that it’s a royal pain in the ass to set up test harnesses for a large DB schema, and the ability to separate that from all of the more meaningful (read: less infrastructural) testing is a definite boon.
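As a rough sketch of why that separation helps testing, consider the following. It is in Python for brevity, and every name in it (Customer, CustomerRepository, bill) is hypothetical. The domain behavior is written against a repository interface, and a test can hand it an in-memory implementation, so no DB schema is involved at all.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol


@dataclass
class Customer:
    # Hypothetical domain object; the behavior lives here, not in the DB layer.
    username: str
    balance: float = 0.0

    def charge(self, amount: float) -> None:
        if amount <= 0:
            raise ValueError("charge must be positive")
        self.balance += amount


class CustomerRepository(Protocol):
    # The only thing the domain code needs to know about persistence.
    def get(self, username: str) -> Optional[Customer]: ...
    def save(self, customer: Customer) -> None: ...


class InMemoryCustomerRepository:
    """Test double: a dict standing in for the real database."""

    def __init__(self) -> None:
        self._rows: Dict[str, Customer] = {}

    def get(self, username: str) -> Optional[Customer]:
        return self._rows.get(username)

    def save(self, customer: Customer) -> None:
        self._rows[customer.username] = customer


def bill(repo: CustomerRepository, username: str, amount: float) -> Customer:
    # Application-level step: load, apply the domain behavior, store.
    customer = repo.get(username)
    if customer is None:
        raise KeyError(username)
    customer.charge(amount)
    repo.save(customer)
    return customer
```

A database-backed repository would satisfy the same interface and could be tested on its own, separately from the rules in Customer.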
In a way, I see the separations as a sort of two-dimensional cut. Layering the software provides several strata to handle, and separating domain “business logic” from “domain model persistence” slices the domain stratum into the meat and bones of a product, respectively.
Chapter 5 focused mostly on the pragmatic aspects of domain modeling. Namely, how can I apply these “pie in the sky” ideas to a real project? While the chapter only touches on this lightly (addressing only a few technical issues of realizing a domain model), it is nice to see, and I hope to see more in later chapters. Without concrete grounding, no design concept can be adopted in the real world.
I’m about halfway through chapter 2 and the discussion on vocabulary has really spoken to me. Quite a few of my accomplishments (or lack thereof) at work have been related to things Evans has suggested:
A nasty habit I’ve seen (in myself and others) is the belief that a generic framework will obviate domain knowledge. I’ve written plenty of small frameworks, and I’m convinced that the only good framework is a domain-specific framework – anything else is best left as a library.
I’ve also run into cases where a domain model has grown without any experts (not because no experts were available, but because the domain wasn’t entirely focused yet). Vocabulary was fickle:
- Should a user’s computer (identified by a MAC address) be “hardware”, or “device”, or “computer”, or something else?
- For that matter, should a user be instead “customer” or “client”?
- And what about the customer’s personal data: “account” or “profile”? A customer has a username and password, which sounds like “account”, but what if the domain grows and a customer can have multiple usernames tied to a single payee?
The end result of a shifting domain and insufficient forethought into the model vocabulary is that some model names have drifted:
- Between “device” and “hardware”, it made most sense to use a term that nontechnical users may be familiar with: “device” was chosen. While in-page text was simple to convert, some web URIs still refer to the original term, “hardware”.
- Between “customer” and “client”, it made sense to follow the nomenclature of the underlying billing software: “customer” was chosen. However, the product was initially developed to be as independent as possible from the billing software, so there are scattered references to Client in the source code. The most awkward is seeing a block of code that starts with
cust = new Client(...)