March 7, 2022 in Five-Minute Analyst

Text Mining National Security

https://doi.org/10.1287/LYTX.2022.02.13

Recently, I found myself teaching about strategic messaging. The national security sector has quite a few strategic documents across all levels, so I chose to start at the top. Because I work primarily in national security, I am exploring the collection of National Security Strategy (NSS) documents. This type of quick-look analysis could be performed on anything. Perhaps you want to know what successful businesses put in their strategic documents? Or maybe mine the plethora of market forecast articles to see whether there is common guidance? The list can go on and on.

So much effort goes into national strategic documents, and what trickles down from them creates many initiatives, programs, etc. Lately, the phrase “strategic deterrence” has been floating around, which to those with gray hair may feel like déjà vu. And it should. But how do we map this back to strategic guidance? What do guidance documents tell us?

NSS documents date back to when Reagan was president of the United States. For many years they were produced annually. Then they became less frequent, and currently we see them approximately once per presidential term of office. In total, if you count the “interim” one produced last year, there are 19 documents to explore. In this five-minute analytic jaunt, I am going to look up the most common words in each document. First, each document should be, at a minimum, slightly scrubbed and put into an appropriate format. Sorry, no easy way around that. Then, using the tm package in R, we dig into the documents.

The tm package allows us to remove punctuation, change case and remove common words. It also allows us to look at the stem of a word versus the entire word. An example is strategi being a stem for strategic, strategies and strategically – although not strategery. Table 1 highlights the most common words in each NSS document.
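As a rough illustration of those cleaning steps, here is a minimal Python sketch using only the standard library. It is not the tm workflow used for this article: the stop list is a tiny hypothetical one, and `crude_stem` is a toy suffix stripper standing in for a real stemmer, so its stems will not match the ones in Table 1 exactly.

```python
import re
from collections import Counter

# Tiny illustrative stop list (an assumption); real stop lists are much longer.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "for", "our", "is", "are", "we"}

# Suffixes are tried in this order, longest first. A toy rule set, NOT the Porter algorithm.
SUFFIXES = ("ically", "ies", "ing", "ic", "es", "ed", "al", "ly", "y", "s")


def crude_stem(word):
    """Strip one known suffix, keeping at least four leading characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 4:
            return word[: -len(suffix)]
    return word


def top_stems(text, n=5):
    """Lowercase, drop punctuation and stop words, stem, then count."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # replace punctuation and digits with spaces
    tokens = [t for t in text.split() if t not in STOPWORDS]
    return Counter(crude_stem(t) for t in tokens).most_common(n)


# Made-up sentence for illustration only; not taken from any NSS document.
sample = ("The national strategy: strategic forces, economic strategies "
          "and the economy of nations.")
print(top_stems(sample, 3))
```

With this toy rule set, strategy, strategic and strategies all collapse to one stem, which is the behavior the paragraph above describes.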

1987       1988       1990       1991       1992       1993       1994       1995       1996       1997       1998       2000       2001       2002       2006       2010       2015       2017       2021
Reagan     Reagan     HW_Bush    HW_Bush    HW_Bush    HW_Bush    Clinton    Clinton    Clinton    Clinton    Clinton    Clinton    Clinton    W_Bush     W_Bush     Obama      Obama      Trump      Biden
soviet     secur      forc       econom     econom     econom     nation     nation     nation     secur      secur      intern     state      nation     secur      nation     will       will       will
forc       strategi   econom     will       nation     nation     will       secur      unit       intern     intern     secur      unit       will       nation     unit       secur      state      nation
nation     econom     secur      secur      must       must       econom     will       will       must       nation     nation     secur      unit       strategi   secur      global     unit       secur
state      soviet     nation     new        unit       unit       secur      forc       state      state      state      state      promot     secur      state      advancin   govern     american   strateg
militari   militari   defens     nation     secur      secur      forc       state      econom     will       will       econom     continu    state      will       strengthen state      econom     guidanc
unit       nation     polit      forc       will       intern     region     econom     forc       region     unit       unit       nation     develop    unit       will       strategi   secur      interim
secur      capabl     soviet     region     intern     state      unit       militari   secur      interest   promot     region     intern     must       democraci  intern     econom     nation     american
econom     forc       militari   soviet     state      will       democraci  democraci  support    nation     region     will       will       freedom    develop    promot     intern     partner    econom
defens     interest   technolog  unit       effort     effort     american   region     intern     econom     effort     promot     engag      threat     challeng   strategi   nation     govern     intern
support    support    maintain   defens     region     region     effort     unit       effort     also       develop    support    strategi   intern     freedom    build      continu    must       technolog
will       countri    will       threat     america    america    militari   peac       american   america    cooper     develop    must       govern     govern     invest     strengthen militari   world
capabl     europ      state      world      democraci  democraci  strategi   interest   region     effort     econom     cooper     threat     use        terrorist  pursu      support    threat     interest
-          union      -          -          -          -          promot     -          -          -          -          -          also       -          region     engag      system     -          global

Table 1: Top words by count for each NSS document. Note how econom peaks in the early 1990s and drops off in the early 2000s. What other trends could we tease out?

In this quick analysis, I did not remove the stems for the words national, security or strategy. One could remove them or leave them in, depending on whether their presence in the document is worth noting. The top two rows are document year and sitting U.S. president. I’ve done some conditional formatting to highlight interesting points. One is the trailing off in use of the word Soviet. Given historical context, this makes complete sense. We have yet to see Soviet or any other country rise to the same relative level in recent documents. The other interesting finding is the stem econom. This one peaks under George H. W. Bush and then falls until it completely disappears at the end of Bill Clinton’s time in office and does not reappear until almost the end of Barack Obama’s second term in office. It has yet to return to the same relative level of importance. A fitting follow-on study to this might be comparing this finding with the actual economic conditions that existed during and in years following each of these documents.
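The Soviet and econom trends can be checked directly against Table 1. The sketch below is an illustration in Python rather than the R workflow used for the article. It copies the first twelve word rows of Table 1 and assumes that the position within a column reflects rank, with the most frequent stem first. The function name `rank_by_year` and its `exclude` parameter are hypothetical.

```python
# First twelve word rows of Table 1, one string per document year.
# Assumption: position within a column is the rank (most frequent first).
TABLE_1 = {
    1987: "soviet forc nation state militari unit secur econom defens support will capabl",
    1988: "secur strategi econom soviet militari nation capabl forc interest support countri europ",
    1990: "forc econom secur nation defens polit soviet militari technolog maintain will state",
    1991: "econom will secur new nation forc region soviet unit defens threat world",
    1992: "econom nation must unit secur will intern state effort region america democraci",
    1993: "econom nation must unit secur intern state will effort region america democraci",
    1994: "nation will econom secur forc region unit democraci american effort militari strategi",
    1995: "nation secur will forc state econom militari democraci region unit peac interest",
    1996: "nation unit will state econom forc secur support intern effort american region",
    1997: "secur intern must state will region interest nation econom also america effort",
    1998: "secur intern nation state will unit promot region effort develop cooper econom",
    2000: "intern secur nation state econom unit region will promot support develop cooper",
    2001: "state unit secur promot continu nation intern will engag strategi must threat",
    2002: "nation will unit secur state develop must freedom threat intern govern use",
    2006: "secur nation strategi state will unit democraci develop challeng freedom govern terrorist",
    2010: "nation unit secur advancin strengthen will intern promot strategi build invest pursu",
    2015: "will secur global govern state strategi econom intern nation continu strengthen support",
    2017: "will state unit american econom secur nation partner govern must militari threat",
    2021: "will nation secur strateg guidanc interim american econom intern technolog world interest",
}


def rank_by_year(stem, exclude=()):
    """Return {year: 1-based rank of stem, or None if it is not listed}.

    Stems in `exclude` are dropped before ranking, e.g. nation/secur/strategi.
    """
    ranks = {}
    for year, column in TABLE_1.items():
        words = [w for w in column.split() if w not in exclude]
        ranks[year] = words.index(stem) + 1 if stem in words else None
    return ranks


for stem in ("soviet", "econom"):
    print(stem, rank_by_year(stem))

# Re-rank after dropping the stems of national, security and strategy.
print(rank_by_year("econom", exclude={"nation", "secur", "strategi"}))
```

Passing the `exclude` set shows how removing the expected stems shifts everything else up the list, which is the choice described at the start of the paragraph above.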

Table 1 also shows some obvious new words at appropriate places, such as the words terrorist and freedom in 2006. Using the word cloud package in R as well as generic bar plots, with some grouping of the data, we can quickly provide informative visuals. Some might find it useful to look at how the two parties compare. Let’s look at the word clouds (Figure 1) and the top 10 words for each (Figure 2). The differences in frequency are not very large, and most of the words overlap between the two parties. The stem econom seems to be higher among Republicans, whereas intern, which is a stem for international, is more common for Democrats. Militari, which is a stem for some obvious words, actually shows up in the top 10 for Republicans but not Democrats. Republicans also use forc[e] at a higher frequency in the NSS documents.

word cloud for Republicans
word cloud for Democrats

Figure 1a and b: Word clouds of the entire collection of NSS documents split into those produced under Republican (a) and Democratic (b) presidents, respectively.

bar plot for Republicans
bar plot for Democrats

Figure 2a and b: Rather than the busy word cloud, we can plot the top 10 words for each political party. Notice how many of the words overlap, but vary in their relative position.
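The party split behind Figures 1 and 2 amounts to pooling per-document counts by a group label and then ranking within each group. Here is a minimal Python sketch of that idea, not the R code behind the figures. The president-to-party mapping is factual, but the counts are made up for illustration, and `pool_counts` and `top_shares` are hypothetical helper names.

```python
from collections import Counter

PARTY = {
    "Reagan": "Republican", "HW_Bush": "Republican", "W_Bush": "Republican",
    "Trump": "Republican", "Clinton": "Democrat", "Obama": "Democrat",
    "Biden": "Democrat",
}


def pool_counts(doc_counts, group_of):
    """Sum per-document Counters into one Counter per group label."""
    pooled = {}
    for doc_id, counts in doc_counts.items():
        pooled.setdefault(group_of(doc_id), Counter()).update(counts)
    return pooled


def top_shares(counts, n=10):
    """Top-n stems as (stem, share of all counted tokens in the group)."""
    total = sum(counts.values())
    return [(stem, c / total) for stem, c in counts.most_common(n)]


# Made-up counts for illustration only; NOT the real NSS frequencies.
doc_counts = {
    (1987, "Reagan"): Counter(soviet=6, forc=4, nation=2),
    (1994, "Clinton"): Counter(nation=5, econom=3, intern=2),
    (2017, "Trump"): Counter(nation=3, econom=5, forc=2),
}

by_party = pool_counts(doc_counts, lambda doc: PARTY[doc[1]])
by_president = pool_counts(doc_counts, lambda doc: doc[1])

rep_top = {stem for stem, _ in top_shares(by_party["Republican"])}
dem_top = {stem for stem, _ in top_shares(by_party["Democrat"])}
shared = rep_top & dem_top  # stems that appear in both parties' top lists
print(sorted(shared))
```

Swapping the `group_of` function is all it takes to regroup by president instead of party, and because the pooled totals change with each grouping, so can the ranking.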

We can also look at the political party bar plots stacked together with all words represented (see Figure 3). This format helps us more easily visualize where differences exist between the two parties. The final visualization comes from grouping results by sitting president (see Figure 4). Note that with each grouping, we reshuffle the top results; thus, grouping by president will yield slightly different results than looking at each individual document, as we did at the start of this five-minute look. Which grouping is best depends on the question being asked. In Figure 4, we can observe that more recent NSS documents have higher percent differences within the top 10. In this example, the difference is small, but it shows a trend in how these documents are written. More recent documents have perhaps a more repetitive nature.

bar plot combining Republican and Democrat
Figure 3: Perhaps a better view of Figure 2, one that allows us to see all the words and where the parties disagree.
bar plot by president
Figure 4: Rather than a roll-up of political party or viewing data on every document, this figure groups by president, which provides a quick look at a larger pool of words. It is interesting that more recent documents have higher bars for fewer words, which appears to indicate more repetition. What other insights can be teased out from this visual?
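One way to put a number on the repetition noted in Figure 4 is the share of all counted words taken up by the top few. The Python sketch below is one possible measure, not the one used to build the figures. `top_k_share` is a hypothetical helper and the two counters hold made-up counts.

```python
from collections import Counter


def top_k_share(counts, k=10):
    """Fraction of all counted tokens accounted for by the k most common stems."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(c for _, c in counts.most_common(k)) / total


# Made-up counts for illustration only; NOT the real NSS frequencies.
varied = Counter(a=3, b=3, c=2, d=2)      # counts spread fairly evenly
repetitive = Counter(a=7, b=1, c=1, d=1)  # one stem dominates

print(top_k_share(varied, k=2))
print(top_k_share(repetitive, k=2))
```

A higher share for the same k means fewer stems carry more of the text, which would be consistent with the taller, narrower bars for recent presidents.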

In closing, let’s think about how this same analysis could be used elsewhere. Within national security, this kind of analysis could be used to analyze one’s own messaging, but also that of an adversary. It could also be used to explore the content of any number of other platforms and messaging products. The tools exist to do this quickly and easily. Just add data and stir.

Nick Ulmer, CAP
