The Research Proposal Workshop

Created January 13, 2015.

This revision May 1, 2019.

In the Master's course of the Department of Policy and Planning Sciences, your first assignment is the research proposal workshop (研究計画発表会). The task is to present your plan to your advisory group (AG), gather their comments, and adjust your plan to address their criticisms and suggestions. Note that although you will be graded immediately on the presentation itself, the task is not complete until you have

It's also important to notice that the presentation is not about the presenter, it's about the audience. It's not appropriate to say what you want to say. You are not a poet or graphic artist. Rather, you need to focus on what your audience needs to hear and see to do their job of evaluating your proposal. I.e., you are marketing to them. Do a SWOT evaluation of your AG! (I am not joking.)

This is a general principle about formal research products (reports and presentations). Your research achievements are "for you," but the research products are "for the audience." Don't forget that!

Plan vs. Presentation

The presentation and the plan are not the same thing. The presentation is a "sample" of your thinking intended to impress the AG. You have about 10 minutes to make your presentation, which is not enough to describe everything you have done to prepare, everything you know about the topic, and everything you plan to do. The sample should not be random, however: the point is to appeal to the AG, to get them to give useful comments, and to make them interested in you.

So you select your points for presentation to

The plan, on the other hand, must be a complete description of

The Research Topic (or Theme)

Carefully consider the difference between the research area and the research topic or theme. In Shako, a research area can be described in terms of interesting facts, but a topic must be a question about a model that you propose to answer. Here, "model" is very general, and does not refer to any specific research techniques. It's simply a statement of relationships among facts that you propose to confirm or contradict through your research. In some fields like history and taxonomy, research may consist primarily of collecting facts and ordering them in a narrative.

Even that little bit of order is a model in the appropriate sense, but it is not acceptable to the Shako faculty. Shako requires a more formal, a priori model, usually involving a clear notion of causality. Here, "formal" means that the parts and relationships of the model are described explicitly, and a priori means that the model is described before the research activity is performed. Considerations outside of the model are unscientific, and will not influence the evaluation of your research technique. (On the other hand, such considerations are often useful to increase the appeal of your research.)

Furthermore, description (or even a new construction) of a model is not enough. It must be tested in some fashion, typically deduction of observable consequences (theoretical research) or application to data (empirical research).

Organization of the Proposal Presentation

The proposal presentation has four components. They need not be separate from each other; they may be mixed in the presentation according to the topic and the presenter's personal style. These components are

  1. an introduction to the subject area and your topic,
  2. a presentation of your model of the entities and relationships you want to study,
  3. a presentation of the results you hope to achieve (i.e., the questions you propose to answer about your model),
  4. and a description of the method you propose to use in your research.

It is conventional to present in the order above, but not necessary.

Introduction

In the introduction, you explain why it is worth the time and effort, not only for you but also for your AG, to pursue your research project. You should describe aspects that would interest the AG. It is usually a good idea to talk about what interested you in the first place: refer to some social phenomenon or policy problem, or a personal experience. But you must also provide evidence of the importance of the research to others. One way to do that is to draw policy implications.

Another is to demonstrate that the proposed research corrects a mistake or closes a gap in the existing literature. But to demonstrate a gap or mistake you need to conduct a literature review. In the auxiliary resources you provide you should include a formal list of sources you have studied in preparation for your research. A good literature review also is evidence of your preparation and competence to perform the proposed research.

Model

A model is a description of the entities to be studied and the relationships among them. Most of your entities will probably be businesses and related things, such as industries, customers, suppliers, markets, and so on. The names of the "related things" here indicate the relationships. You will need to provide more depth than mere names, however.

The role of your model is to keep your research "on track". There is an infinite number of ways to approach any topic, and most of them lead out of the topic and even out of the research area unless disciplined by a model. The model states what things are important to your research, and how they fit together. If in the course of your research you discover that something important was left out, you need to stop and consider carefully how to continue. It is always possible to proceed with the same model, change the question to "is this model sufficient to explain the motivating phenomena?", and accept that the answer may be "no, it isn't". Alternatively, you may add the new entity or relationship to your model, but that may imply bringing in even more related things.

In Shako, models are expected to be at least somewhat formal. Of course many professors do theoretical research, using very formal mathematical or statistical models. That is not necessary. However, you cannot just collect facts and estimate regression equations using all the variables to hand, and expect to satisfy the Shako faculty. Without a semi-formal model in mind, you cannot formulate equations or statistical hypothesis tests. So you may as well make that model explicit.

Also, in Shako, models are expected to be somewhat general. A case study, in which you explain the behavior of a particular business firm, is not sufficient. You need to be able to apply your findings to other, similar businesses, including defining what degree of "similarity" makes your findings applicable to other businesses.

Results

A "research result" is the answer to a question about your research area. It might be the coefficient on a variable in a regression equation, or the whole equation, or the result of a hypothesis test.

So to describe the expected research results, you need to state the questions you want answers to. It is not enough to say you will "investigate" some topic, or "conduct a survey and analyze the data". To describe an expected result, you need to propose a relation among data you expect to observe. Your research is to actually observe the data and to demonstrate that the relationship is valid (and/or measure its strength), or perhaps that it is invalid.
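As a minimal sketch of what "proposing a relation among data" can look like, the following fits a straight line by ordinary least squares. The numbers and the variable names (`advertising`, `demand`) are invented purely for illustration; they are not data from any study. The proposed relation is "more advertising goes with more demand", which corresponds to a positive slope.

```python
# Hypothetical observations, invented purely for illustration.
advertising = [1.0, 2.0, 3.0, 4.0, 5.0]  # e.g. spending per period
demand = [2.0, 4.0, 5.0, 4.0, 5.0]       # e.g. units sold per period


def ols_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


intercept, slope = ols_line(advertising, demand)
# The expected result is stated before looking at the data: slope > 0.
print(f"demand = {intercept:.2f} + {slope:.2f} * advertising")
print("relation supported" if slope > 0 else "relation not supported")
```

A real proposal would also say how the strength of the relation will be judged (for example, with a hypothesis test on the slope), which this sketch leaves out.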

Methodology

Methodology is how you intend to perform your research. For theoretical work, it is a description of the model and the mathematical techniques you will use.

In empirical work, you need to describe the data and its source, and how you can confirm its reliability. Then you need to discuss how you will extract the model relationships from the data, and test their significance.

Ontology graphs

An ontology graph (ograph for a short name) is also called an "ontology log" or "olog". "Ontology" is the branch of philosophy that studies entities (i.e., things that exist) and their relationships, such as "part and whole", "cause and effect", "member and set", "parent and child", and so on. A "log" is a record of history; I suppose the idea of "ontology log" is "the record of the entities I've learned about, and the things I've learned about their relationships". However, the figure itself doesn't necessarily contain that dynamic aspect of your learning process, only the current result. For that reason, and also because ologs have a formal definition, I will use the term "ograph" for our informal diagrams.

Ographs are very general and very abstract, so it may be hard to get a handle on them. If you prefer something more structured, there is a graphical representation frequently used for business and policy models (less so for economics and engineering) called RAM (short for "reticular action model", but I don't know what that means myself). RAMs are described in the section "Reticular Action Model (RAM)" below.

Consider an example from marketing. We start from an almost trivial model where advertising leads to demand.

[Figure: ograph of the base model. Arrow: Advertising -[causes]-> Demand.]

However, in our company we think of marketing as something more than just advertising, and we realize any information about our activity may affect demand. Furthermore, our customers are consumers, motivated by preference rather than profit.

[Figure: ograph of the modified model. Arrows: Advertising -[causes]-> Demand; Advertising -[is a kind of]-> Firm Behavior; Firm Behavior -[causes]-> Consumer Demand; Demand -[is a kind of]-> Consumer Demand.]

But this ograph has a big problem: "demand" is not "a kind of" "consumer demand", it's the other way around. We could just turn the arrow around:

[Figure: ograph of the modified model with one arrow reversed. Arrows: Advertising -[causes]-> Demand; Advertising -[is a kind of]-> Firm Behavior; Firm Behavior -[causes]-> Consumer Demand; Consumer Demand -[is a kind of]-> Demand.]

but visually the arrows lead us back into the original model. That doesn't seem very good, because we just decided the original model isn't right for our business. So instead, since each arrow has its own special character, we relabel the "down" arrows:

[Figure: ograph of the modified model with relabeled arrows. Arrows: Advertising -[causes]-> Demand; Advertising -[generalizes to]-> Firm Behavior; Firm Behavior -[causes]-> Consumer Demand; Demand -[specializes to]-> Consumer Demand.]
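An ograph is just a set of labeled arrows, so it can be written down as data. The following is a minimal sketch, assuming each arrow is stored as a (source, label, target) triple; the helper names `entities` and `arrows_labeled` are made up for this illustration.

```python
# Each arrow of the ograph is one (source, label, target) triple.
base_model = [
    ("Advertising", "causes", "Demand"),
]

# The modified model keeps the base model and adds the new arrows.
modified_model = base_model + [
    ("Advertising", "generalizes to", "Firm Behavior"),
    ("Demand", "specializes to", "Consumer Demand"),
    ("Firm Behavior", "causes", "Consumer Demand"),
]


def entities(edges):
    """Collect every entity that appears at either end of an arrow."""
    return sorted({node for source, _, target in edges for node in (source, target)})


def arrows_labeled(edges, label):
    """Return the (source, target) pairs joined by arrows with the given label."""
    return [(source, target) for source, lbl, target in edges if lbl == label]


print(entities(modified_model))
print(arrows_labeled(modified_model, "causes"))
```

Writing the arrows out this way makes it easy to check that every arrow has a label, and that arrows with the same label really mean the same kind of relationship.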

Note that we keep the original model. After all, we know about that model, too. This gives us another reason to keep the arrows between models pointing in the same direction: it leaves a trace of the history of development. We can indicate that by highlighting our current model:

[Figure: ograph with the current model highlighted. Arrows: Advertising -[causes]-> Demand; Advertising -[generalizes to]-> Firm Behavior; Demand -[specializes to]-> Consumer Demand; Firm Behavior -[causes]-> Consumer Demand.]

If drawing by hand, you would be more likely to put a box around it, and add a label:

[Figure: ograph with a box labeled "Current Model". Arrows: Advertising -[causes]-> Demand; Advertising -[generalizes to]-> Firm Behavior; Demand -[specializes to]-> Consumer Demand; Firm Behavior -[causes]-> Consumer Demand.]

but, technically speaking, ologs don't have boxes. Instead, you would represent the box in a different way:

[Figure: ograph with the boxes replaced by the entities "Original Model" and "Current Model". Arrows: Original Model -[has component]-> Advertising; Original Model -[has component]-> Demand; Advertising -[causes]-> Demand; Advertising -[generalizes to]-> Firm Behavior; Demand -[specializes to]-> Consumer Demand; Firm Behavior -[causes]-> Consumer Demand; Firm Behavior -[has component]-> Current Model; Consumer Demand -[has component]-> Current Model.]

That's kind of ugly, and I think humans have a hard time understanding it. Colors (or maybe boxes) are better. I mention the point about ologs not having boxes because modeling is a balance between expressiveness (to come "close to reality") and discipline (so our research can "fit into our heads", and so that software can help us do analysis). Avoiding boxes (which are not part of the "true" olog form of expression) is a kind of discipline.
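To illustrate the "software can help" side of that balance, here is a minimal sketch, assuming arrows are stored as (source, label, target) triples and that every "has component" arrow points from the model to the entity (a normalization chosen for this example). With that discipline, a program can recover the boxes from the arrows alone. The function name `boxes` is hypothetical.

```python
from collections import defaultdict

# The ograph with boxes expressed as "has component" arrows.
edges = [
    ("Original Model", "has component", "Advertising"),
    ("Original Model", "has component", "Demand"),
    ("Current Model", "has component", "Firm Behavior"),
    ("Current Model", "has component", "Consumer Demand"),
    ("Advertising", "causes", "Demand"),
    ("Advertising", "generalizes to", "Firm Behavior"),
    ("Demand", "specializes to", "Consumer Demand"),
    ("Firm Behavior", "causes", "Consumer Demand"),
]


def boxes(edge_list):
    """Group entities by the model that 'has' them as components."""
    groups = defaultdict(list)
    for source, label, target in edge_list:
        if label == "has component":
            groups[source].append(target)
    return dict(groups)


for model, members in boxes(edges).items():
    print(model, "contains", members)
```

The drawing for humans can still use colors or boxes; the flat list of arrows is what a program would work from.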

"True" ologs don't have colors, either, but they help humans to see the kind of entity or relationship without reading the labels. So if you use colors, make sure the use of color is consistent with the meaning of your labels. And feel free to use boxes if you want and they look good!

One last change. Since the point of the "new" model is that we have generalized the firm behaviors, we might like to know the specific character of the new behaviors. Here is one example:

[Figure: ograph with firm behavior broken into channels. Arrows: Advertising -[causes]-> Demand; Advertising -[generalizes to]-> Firm Behavior; Demand -[specializes to]-> Consumer Demand; Firm Behavior -[by marketing department]-> Advertising (channel); Firm Behavior -[by marketing department]-> Promotional Pricing; Firm Behavior -[by other departments]-> Nonmarketing Behavior; Advertising (channel) -[causes]-> Consumer Demand; Promotional Pricing -[causes]-> Consumer Demand; Nonmarketing Behavior -[causes]-> Consumer Demand.]

As perhaps you can see, this ograph expresses not only the cause and effect relationship between firm behavior and consumer demand, but also

It doesn't express anything about external factors (including other firms' behavior, consumer preference, the weather, or the state of the economy). You might want to add some of them to the ograph, or you might want to generalize "other departments" to "other factors" (meaning factors the marketing department doesn't control but needs to consider because they affect the success of marketing activity). Which way you improve the ograph reflects a decision about the scope of your research theme.

Scheduling

Start by making a calendar of important events until graduation. Most of the dates you already know will be deadlines imposed by the school, including those for submission of forms and reports and for formal presentations for evaluation. You should also include the exam dates of your classes. Some events are not deadlines, such as

  • graduation itself,
  • any vacations you plan to take (this includes time off due to visits from family and the like as well as your own travel), including the period between the normal thesis defense and graduation (for the unlikely event that you have an unsatisfied requirement that can be satisfied in that interval), and
  • other events you might think of that you need to spend time preparing for.

Next you should make a broad plan for your own research goals. First, you should be aware that formally, your research period ends with the submission of the "kari-toji" (the draft thesis) in November, not with the submission of the bound thesis in January. In theory the only changes made from November to January should be specific changes requested by the advisor and the AG. (In practice there will be more flexibility, but I don't want you to count on it. In particular, in your "midterm" presentation you may never say "I plan to do more ..." unless you're continuing to the PhD program!)

Then make a list of activities you will be performing on a regular basis for extended periods of time. If the time period is limited, such as a class, you should indicate the time period. Intensive lecture courses that happen in one day or a few consecutive days can also be treated as "events", whichever seems more natural to you.

Outline

Reticular Action Model (RAM)

A particular form of olog used in statistical analysis of psychology (including marketing, sociology, behavioral economics, and any other field that depends on realistic models of human choice rather than optimization) is the diagrammatic form of the McArdle-McDonald reticular action model (RAM). This section follows the discussion in Kline, Principles and Practice of Structural Equation Modeling, Ch. 6. Note that the RAM is a mathematical specification restricting the form of statistical models. Here the focus is on a particular visual representation of such models using network diagrams.

The basic idea of the RAM is that there are characteristics (such as gender, quantity, or brands) of various entities (such as consumers, products, and firms, respectively), and there are relationships between them. Characteristics are divided into two types: manifest (visible to the researcher) and latent (invisible to the researcher). The relationships are also divided into two types: causal relationships (such as market price to consumer demand) and correlations (usually in the context of error terms in statistical modeling, but sometimes where causality is bidirectional).

RAM diagrams use a simple notation. Variables are represented by shapes: rectangles denote manifest variables, and ellipses denote latent variables. Relationships are denoted by arrows: causal arrows have arrowheads at one end (the effect) but not the other (the cause), while correlations have arrowheads at both ends.
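The notation can be sketched in code. The following is a minimal illustration, assuming the diagram is drawn with Graphviz and written in its DOT language; the function `ram_to_dot` and the example variables are hypothetical, and treating consumer demand as latent is an assumption made only for this example.

```python
def ram_to_dot(variables, relations):
    """Render a RAM diagram as Graphviz DOT text.

    variables: dict mapping a variable name to "manifest" or "latent".
    relations: list of (a, b, kind) with kind "causal" or "correlation".
    """
    # Rectangles for manifest variables, ellipses for latent ones.
    shapes = {"manifest": "box", "latent": "ellipse"}
    lines = ["digraph ram {"]
    for name, kind in variables.items():
        lines.append(f'  "{name}" [shape={shapes[kind]}];')
    for a, b, kind in relations:
        # Causal arrows point at the effect; correlations have two heads.
        direction = "both" if kind == "correlation" else "forward"
        lines.append(f'  "{a}" -> "{b}" [dir={direction}];')
    lines.append("}")
    return "\n".join(lines)


dot = ram_to_dot(
    {"Market price": "manifest", "Consumer demand": "latent"},
    [("Market price", "Consumer demand", "causal")],
)
print(dot)
```

The printed text can be pasted into any Graphviz renderer to produce the diagram.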

Let's see how this looks in practice. According to microeconomics, price and quantity in a market are related to each other, and in any given market they tend to move either together or in opposite directions. You can give a plausible explanation for either direction. As prices rise, consumers tend to avoid that good, so quantities "should" decrease. But producers will want to produce more, so quantities "should" increase. Which will happen depends on outside factors not described here, so this is a correlation:

[Figure: RAM diagram. Double-headed arrow between Price and Quantity, labeled "correlation".]

Note that this RAM graph is exactly like an ograph. In fact it is an ograph. The difference is that the RAM graph has to obey certain rules, and the latency attribute is expressed by the shape of the node.

On the other hand, we can consider that we have two forces, both generated by price, and both acting on quantity:

[Figure: RAM diagram. Two arrows from Price to Quantity, one labeled "supply" and one labeled "demand".]

This model has three problems. The first is that it's not economically accurate. Price has causal effects on quantity demanded and quantity supplied, not on market quantity. In the usual model a mismatch between the demanded and supplied quantities has a causal effect on price. This is why the arrow in the first diagram is double-headed: the causal effects go in both directions. That's enough said about that. The second is that this RAM is not identified, which will show up in a regression model as extreme multicollinearity. That's a topic for a different session.

The third problem is that the effect variable isn't (market) quantity, which is manifest, but rather quantity demanded or quantity supplied, at least one of which is latent (at least in disequilibrium). Let's look at a proper RAM for the demand relationship:

[Figure: RAM diagram. Arrow: Price -[demand]-> Quantity demanded.]

Although there's nothing wrong with this last model, let's consider a variation. Normally prices are posted in a market, and so are visible to the researcher. But consider that the consumer might have a discount coupon, and the researcher doesn't know whether the consumer has one or not. Then the price is also latent!

[Figure: RAM diagram. Arrow: Price, maybe with coupon -[demand]-> Quantity demanded.]

Finally, neither of the last two models is very useful because they can't be treated empirically: at least one of the variables has no data, by definition of "latent". Let's consider a complete model:

[Figure: RAM diagram of the complete model. Arrows: Price -[demand]-> Quantity; Quantity -[Q1]-> "How much do you like it? [1-5]"; Quantity -[Q2]-> "How much would you pay for it? [$0-$500]"; Quantity -[Q3]-> "Look at the ad. Will you buy it? [no, yes]".]

Although you can't measure Quantity directly, you can measure the responses to the questionnaire. By adding up one consumer's responses, you can estimate "buy" vs. "not buy" for that consumer, because those answers depend only on the actual quantity the consumer wants at any given price. In the model with price manifest and quantity latent, however, you have no way to distinguish between consumers who will buy and those who won't, so you can't estimate demand.

A manifest variable can be measured directly, because that's the definition of manifest variable. A latent variable can't be measured by definition, but if it causes manifest variables to change, then it can be estimated by observing and aggregating those manifest variables.
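Here is a minimal sketch of "observing and aggregating" manifest variables, using the three questionnaire items from the complete model. It assumes each answer is rescaled to the range 0 to 1 and that the three items are weighted equally; that weighting is a simplification chosen for this example, not a recommended estimator, and the respondents are invented.

```python
def rescale(value, low, high):
    """Map an answer on the scale [low, high] onto [0, 1]."""
    return (value - low) / (high - low)


def latent_score(q1, q2, q3):
    """Estimate the latent variable from three manifest answers.

    q1: "How much do you like it?" on a 1-5 scale
    q2: "How much would you pay for it?" in dollars, 0-500
    q3: "Will you buy it?" as True (yes) or False (no)
    """
    indicators = [
        rescale(q1, 1, 5),
        rescale(q2, 0, 500),
        1.0 if q3 else 0.0,
    ]
    # Equal weights are an assumption of this sketch.
    return sum(indicators) / len(indicators)


# Hypothetical respondents, invented purely for illustration.
respondents = {"A": (5, 400, True), "B": (2, 50, False)}
for name, answers in respondents.items():
    print(name, round(latent_score(*answers), 3))
```

A higher score suggests a consumer closer to "buy". In a real analysis the weights would be estimated from the data rather than fixed in advance.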

A final comment: the last model is complete in a scientific sense, but it lacks a way to deal with errors. By adding random errors as latent variables in appropriate places, a RAM can express not only the "scientific" cause and effect relationships, but also the causes of "statistical variation" in measured variables.