I previously wrote about a model for studying situations where individuals are nested in a higher-level group. The higher-level groups could be neighborhoods, cities or towns, medical practices, shopping clubs, etc. They could even be market segments. This last one is interesting because the whole purpose of market segmentation is to classify consumers into homogeneous groups so as to develop better marketing and pricing offers for them; better target marketing, if you wish. Segmentation, however, creates a nested structure by definition: individual consumers are nested in a segment, so segment characteristics influence their buying behavior.
A (Fictional) Nested Problem
Let’s consider a problem involving price setting for stores in a retail chain. Stores come in different sizes (i.e., selling surfaces). There are small storefronts (e.g., Mom & Pops) and “Big Box” stores (e.g., warehouse clubs). Even within one chain, store sizes vary, perhaps due to geography. For example, stores vary in size between Manhattan and Central Jersey. Suppose we have (fictional) data on one consumer product sold through a retail chain in New England states. The chain has six stores: 3 in urban areas (i.e., small stores) and 3 in suburban areas (i.e., large stores). We also know the store sizes in square footage. We have data on the purchase behavior of 600 consumers. Each consumer’s daily purchases and the prices they paid were averaged to one annual number each so n = 600. We also know the consumer’s household income. We need to estimate a price elasticity for this product.
A Naïve Model Approach
A simple, naïve model is a pooled regression based on a Stat 101 data structure. This means that all the data are in just one dataset with no distinction by store size or location. A demand model in log terms (log quantity, log price, and log income) could be estimated. The advantage of using logs is that the estimated price coefficient is the price elasticity. Here are the results of a pooled model.
The estimated price elasticity is -2.4, so demand is highly elastic: a 1% price increase is associated with roughly a 2.4% drop in quantity.
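To see why the price coefficient in a log-log model can be read directly as an elasticity, here is a minimal sketch of a pooled regression. It does not use the data behind this post: the 600 records are simulated, the "true" coefficients (3.0, -2.4, 0.6) and the price and income ranges are made up for illustration, and the small `ols` helper is a hypothetical stand-in for whatever regression routine you prefer.

```python
import math
import random


def ols(X, y):
    """Least squares via the normal equations (X'X) b = X'y, solved by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


random.seed(42)

# Made-up "true" parameters used only to simulate data.
TRUE_INTERCEPT = 3.0
TRUE_PRICE_ELASTICITY = -2.4
TRUE_INCOME_ELASTICITY = 0.6

rows, log_quantity = [], []
for _ in range(600):  # one averaged record per consumer
    price = random.uniform(1.0, 5.0)            # hypothetical price range
    income = random.uniform(30_000, 150_000)    # hypothetical income range
    ln_q = (
        TRUE_INTERCEPT
        + TRUE_PRICE_ELASTICITY * math.log(price)
        + TRUE_INCOME_ELASTICITY * math.log(income)
        + random.gauss(0.0, 0.1)                # small random noise
    )
    rows.append([1.0, math.log(price), math.log(income)])
    log_quantity.append(ln_q)

# Pooled model: every consumer in one regression, stores ignored.
intercept, price_elasticity, income_elasticity = ols(rows, log_quantity)
print(f"estimated price elasticity:  {price_elasticity:.2f}")
print(f"estimated income elasticity: {income_elasticity:.2f}")
```

Because the simulated data have no store structure, the pooled estimate lands close to the value used to generate them. The point of the rest of this post is that real retail data do have that structure, which a pooled fit ignores.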
A Slightly Better Approach
A better model would account for store size, proxied by location: urban or suburban. Location is a reasonable proxy since the large stores are in suburban areas. A location dummy would be defined as 1 for suburban stores and 0 for urban stores. Here are the results when a dummy variable is added.
The price elasticities are: Urban: -1.2; Suburban: -0.6. These make intuitive sense since urban stores face more competition. Food stores in Manhattan, for example, face stiff competition since there are several food stores/restaurants on each block. Suburban stores, however, are more spread out geographically, so the value of time has to be factored in when shopping. This would make the demand for a product more inelastic.
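The exact specification is not shown here, but a dummy can only produce two different elasticities if it interacts with log price. As a sketch under that assumption, suppose the model contains a term of the form dummy × ln(Price). The urban elasticity is then the plain price coefficient and the suburban elasticity is that coefficient plus the interaction coefficient. The two coefficient values below are backed out from the elasticities reported above, not read from any regression output, and the function name is hypothetical.

```python
def location_elasticity(beta_price: float, delta_interaction: float, suburban: int) -> float:
    """Price elasticity implied by ln(Q) = ... + beta*ln(P) + delta*(suburban * ln(P))."""
    return beta_price + delta_interaction * suburban


# Assumed coefficients, inferred from the reported elasticities (-1.2 and -0.6).
BETA_PRICE = -1.2         # coefficient on ln(Price); applies to urban stores (dummy = 0)
DELTA_INTERACTION = 0.6   # coefficient on dummy * ln(Price); shifts suburban stores

urban = location_elasticity(BETA_PRICE, DELTA_INTERACTION, suburban=0)
suburban = location_elasticity(BETA_PRICE, DELTA_INTERACTION, suburban=1)

# Exact percentage change in quantity for a 10% price increase under each elasticity.
for name, e in (("urban", urban), ("suburban", suburban)):
    pct_change = (1.10 ** e - 1.0) * 100.0
    print(f"{name:9s} elasticity {e:+.1f}: 10% price rise -> {pct_change:+.1f}% quantity")
```

A dummy that enters only as an intercept shift would move the demand curve up or down but leave a single common elasticity, which is why the interaction is the piece that matters here.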
A Multilevel Approach
The dummy variable approach classifies store size only as small/large or urban/suburban. Store size, however, as a macro (level 2) context or ecological factor, may be a proxy for hard-to-measure factors such as customer characteristics, types of shopping trips (i.e., goals), and shopping time constraints, so the actual size should be used. A product is nested in a store, which is characterized by its size, so the size has to be considered. Here are the results of a multilevel model with varying intercepts, one per store, each reflecting that store's size. Since there are six stores, there are six intercepts.
The estimated parameters for store 1 (which is small at 397 square feet) are:
Net Intercept: 2.47 + 0.11 = 2.58
Price Effect: -0.91 + 0.02 = -0.89
So, the estimated model is:
ln(Quantity) = 2.58 - 0.89 × ln(Price) + 0.62 × ln(Income)
The price elasticity for store 1 is -0.89, so demand at this store is slightly inelastic.
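The arithmetic for store 1 can be sketched in a few lines: each store's coefficient is the overall estimate plus that store's own deviation. The numbers are the ones quoted above; the class name, the field names, and the price and income used in the check at the end are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class StoreDemand:
    """Log-log demand curve for a single store."""
    intercept: float
    price_elasticity: float
    income_elasticity: float

    def log_quantity(self, price: float, income: float) -> float:
        return (
            self.intercept
            + self.price_elasticity * math.log(price)
            + self.income_elasticity * math.log(income)
        )


# Overall estimates quoted in the text.
OVERALL_INTERCEPT = 2.47
OVERALL_PRICE = -0.91
INCOME_COEF = 0.62

# Store 1 deviations quoted in the text.
STORE1_INTERCEPT_DEV = 0.11
STORE1_PRICE_DEV = 0.02

store1 = StoreDemand(
    intercept=round(OVERALL_INTERCEPT + STORE1_INTERCEPT_DEV, 2),
    price_elasticity=round(OVERALL_PRICE + STORE1_PRICE_DEV, 2),
    income_elasticity=INCOME_COEF,
)
print(store1)

# Check: the change in ln(Q) per unit change in ln(P) is the price elasticity.
# The price (2.0) and income (50,000) are arbitrary illustrative values.
base = store1.log_quantity(price=2.0, income=50_000)
bumped = store1.log_quantity(price=2.0 * 1.01, income=50_000)
implied_elasticity = (bumped - base) / math.log(1.01)
print(f"implied price elasticity: {implied_elasticity:.2f}")
```

Repeating the same sum with each store's deviations gives the other five store-level elasticities.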
A graph of the price elasticities is:
What Did We Gain?
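Taking account of store size, a level 2 macro variable, allows for a richer model. Instead of a single pooled elasticity (-2.4) or one per location (-1.2 urban, -0.6 suburban), each store gets its own estimate, such as -0.89 for store 1. The result is better price elasticity estimates for setting prices store by store.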