Changes in Muslim Nations’ Centrality Mined from Open-Source World Jihad News: A Comparison of Networks in Late 2010, Early 2011, and Post-Bin Laden

James A. Danowski, Ph.D.
Dept. of Communication
University of Illinois at Chicago
Chicago, USA

Abstract—This research analyzes the changes in Muslim nation (MN) networks and semantic networks associated with jihad across three recent periods: 1) late 2010, 2) the early 2011 Muslim Middle East and North Africa uprisings, and 3) the takedown of Osama Bin Laden. Transcripts of web sites, broadcasts, newspapers, and other content were mined for 46 Muslim nations. Results show that Somalia made the largest move upward across the three periods, increasing 21 times in network centrality. Iran is consistently in the top 2 positions. The network increased in link strength and indegree but became less structured in the early uprising period, and the decline in structure continued in the post-Bin Laden period, results consistent with crisis effects. Words paired with ‘jihad’ that increased and decreased in the early uprising and post-Bin Laden periods revealed messages that reflected major changes in substantive content across the three periods. The results appear to have face validity, and demonstrate how mining open sources for inter-nation networks and for semantic networks about a topic of interest, in this case ‘jihad,’ can provide quantitative evidence with statistical tests that have intelligence and security implications.

Keywords-Muslim nations; jihad; web mining; semantic networks; social network analysis; international networks; Middle-East uprisings; Osama Bin Laden takedown; cyber security

I. INTRODUCTION

The web contains a wide variety of open sources of intelligence material regarding jihad and Muslim nations. One source of interest, Lexis-Nexis, contains transcripts of broadcasts, newspaper stories, and other documents.
This research examines two main events against a benchmark of prior documents: the early 2011 uprisings in the Muslim Middle East and North Africa (MMENA) and the takedown of Osama Bin Laden. Of particular interest is how the concept of jihad may have changed in these sources, and how the relationships among nations with majority Muslim populations, as represented in these sources, may have changed. Analysts can evaluate these changes for their security significance.

II. BACKGROUND AND LITERATURE REVIEW

Social networks in relation to Muslim Middle-East uprisings have been of interest for nearly twenty years [1]. Like most social network analysis (SNA) research, that study analyzed networks among individuals as nodes. Nevertheless, network analysis of the MMENA region using nations as nodes has also been useful. Studies have been done of the MMENA network of telephone call traffic [2]. The Dark Web project has used data mining methods to gather web content with a jihadi focus from around the internet [3]. While these projects have been useful open sources of intelligence about jihadi activity on the web, additional research, such as that presented here, is also of potential value. It uses SNA methods to map the links among the 46 MNs and to profile them in terms of both network structure and jihad content in major news sources, such as transcriptions of broadcasts, newspaper stories, and other web content.

The wave of MMENA uprisings that began earlier this year, and continues, is so recent that scholarly research has yet to catch up with it. News and opinion in the popular press are essentially the only sources available. While some of these documents are analytical treatments with intelligence value, it is perhaps useful to conduct research using quantitative scholarly methods.

Each nation of the world has a unique top-level domain name, such as .id for Indonesia, and most intra-country web pages and other internet information are coded for that domain name. This is one way in which the science of Webometrics [4] represents domain interlinkage.

The research questions are: 1) how has the centrality of Muslim nations extracted from news documents changed from a baseline period to the early 2011 MMENA uprising period, and from that period to the post-Bin Laden period? 2) how has the concept of ‘jihad’ changed in its association with other words across these three periods?

III. RESEARCH DESIGN AND METHODOLOGY

The period from January 1, 2011 to April 30, 2011 is treated as the period of early 2011 Muslim Middle-East and North Africa uprisings. The period from May 1, 2011, when the takedown of Osama Bin Laden was announced, until May 16, when the current data collection was completed because of the paper submission deadline, constitutes the second major event comparison period. Given that the uprising period comprised four months, a benchmark set of data was collected for the last four months of 2010, from September 1 to December 31.

The data for this research were extracted from full-text documents downloaded from Lexis-Nexis Academic’s “major world publications” source, which contains transcripts of broadcasts, newspaper stories, web documents, and other textual material. Within each of the three periods the same search strategy was repeated. It obtained the full texts of all documents that contained two terms: jihad and the name of one of 46 Muslim nations. Each query consisted of the term ‘jihad’, the Boolean connector ‘and’, and an individual country name. The countries were 46 nations with majority Muslim populations. One country, Kosovo, was eliminated because it has no top-level domain.
The reason for this exclusion is that another paper mines web hyperlinks among these MNs' TLDs, as shown in Table 1, and also searches for the word ‘jihad’ in web content. In other words, 46 searches were repeated three times, once for each comparison period. Documents were obtained in plain text format and combined into a single file for each period. There were 26.7 MB of text for period one; 41.5 MB for period two, a 55% increase from the four-month benchmark period to the four-month early 2011 uprising period; and 23.2 MB for period three, 17 days after the takedown of Bin Laden.

The software suite WORDij [5] was used for five types of analysis of these documents:
1) all words appearing within three positions of one another were counted as word pairs and their frequencies cumulated, using the WordLink program in WORDij;
2) the words within two steps of the word ‘jihad’ were extracted from the total word pair lists using the program NodeTric, for node-centric analysis, in WORDij;
3) an include list of country names (converted to top-level domain codes (TLDs)) was used to count country pairs that appeared within three country name positions in each sentence of each document, creating a country network data set for each period, using different options of the WordLink program;
4) country pairs that significantly increased and decreased from period one to two and from period two to three were identified using the Z-Pairs software in WORDij; and
5) the word pairs in the node-centric network surrounding ‘jihad’ that significantly increased and decreased from one period to the next were also identified.

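The windowed counting in steps 1 and 3 can be illustrated with a minimal Python sketch. It is not WORDij's WordLink code, and several choices are assumptions made only for illustration: pairs are treated as unordered, sentences are split on simple end punctuation, and the names `window_pairs`, `country_pairs`, and `INCLUDE` are hypothetical. The include list is a four-country excerpt of Table 1, and multi-word names such as Saudi Arabia would need extra handling.

```python
import re
from collections import Counter

# Hypothetical excerpt of an include list mapping country names to TLD codes.
INCLUDE = {"iran": "ir", "somalia": "so", "egypt": "eg", "pakistan": "pk"}


def window_pairs(tokens, window=3):
    """Count unordered pairs of distinct tokens at most `window` positions apart."""
    counts = Counter()
    for i, left in enumerate(tokens):
        for right in tokens[i + 1 : i + 1 + window]:
            if left != right:
                counts[tuple(sorted((left, right)))] += 1
    return counts


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z']+", text.lower())


def country_pairs(document, include=INCLUDE, window=3):
    """Within each sentence, drop everything but included country names,
    convert them to TLD codes, and count pairs within `window` positions."""
    counts = Counter()
    for sentence in re.split(r"[.!?]+", document):
        codes = [include[t] for t in tokenize(sentence) if t in include]
        counts.update(window_pairs(codes, window))
    return counts


if __name__ == "__main__":
    sample = "Iran and Somalia were discussed. Egypt, Pakistan and Iran met."
    for pair, n in sorted(country_pairs(sample).items()):
        print(pair, n)
```

Calling `window_pairs(tokenize(text))` on a whole document gives the general word-pair counts of step 1, while `country_pairs` applies the include-list variant of step 3. A stop word list could be applied in `tokenize` in the same way the include list is applied here.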
Next, two kinds of network analysis were performed, one of country-country networks and the second of word networks surrounding ‘jihad.’ Although WORDij has a graphing utility that animates changes over time in networks, the features of Ucinet [6] and NetDraw [7] are more robust for computing network statistics such as flow betweenness centrality [8][9] and for graphing the networks with flexible coding options, such as resizing nodes according to flow betweenness centrality.

Fig. 1 shows the breakdown of kinds of documents extracted. Figs. 2-4 show the country networks identified in each period, and Figs. 5-7 show the semantic networks for ‘jihad’ in each period. To foster visual utility, word pairs with fewer than 50 occurrences were dropped in each word network. A small stop word list containing basic grammatical function words was used to drop such words during the WordLink analysis. For the country networks identified in WordLink using the include list (as opposed to stop list) option, all frequencies were used.

Figure 1. Distribution of Lexis-Nexis Sources Used

Table 1. Key to Top-Level Domains (TLDs)

TLD  Country        TLD  Country
af   Afghanistan    my   Malaysia
al   Albania        mv   Maldives
dz   Algeria        ml   Mali
az   Azerbaijan     mr   Mauritania
bh   Bahrain        ma   Morocco
bd   Bangladesh     ne   Niger
bn   Brunei         ng   Nigeria
bf   Burkina Faso   om   Oman
td   Chad           pk   Pakistan
km   Comoros        qa   Qatar
dj   Djibouti       sa   Saudi Arabia
eg   Egypt          sn   Senegal
gm   Gambia         sl   Sierra Leone
gn   Guinea         so   Somalia
id   Indonesia      sd   Sudan
ir   Iran           sy   Syria
iq   Iraq           tj   Tajikistan
jo   Jordan         tn   Tunisia
kz   Kazakhstan     tr   Turkey
kw   Kuwait         tm   Turkmenistan
kg   Kyrgyzstan     ae   U.A.E.
lb   Lebanon        uz   Uzbekistan
ly   Libya          ye   Yemen

The measure of centrality most appropriate to communication data is flow betweenness centrality, rather than betweenness centrality [10].
In its formal definition, betweenness centrality assumes that each link is coded as 0 or 1, present or absent, and that only the positions of nodes on the single shortest path, the geodesic, are of interest. The number of times that each node appears on a shortest path linking two other nodes is aggregated and normalized to produce the betweenness measure. Freeman betweenness centrality measures each actor's positional advantage: actors are "between" other actors, who are presumed to depend on these more central, high-betweenness intermediaries to communicate with one another.

In contrast, imagine two actors want to exchange messages, but the geodesic path between them is blocked by an uncooperative or undesirable intermediate node. If another path exists around this blockage, the two nodes will be inclined to use it even though it is not the shortest path. Nodes may also choose to communicate multiple messages through different paths at the same time or close to the same time. Allowing for multiple paths fits the reality of human communication and other communication behaviors better than restricting communication to one shortest path. Nodes are more often free to use any path connecting them, and may find this more fitting to the realities of the situation than being restricted to the geodesic paths.

Flow betweenness assumes that nodes will use multiple pathways, in proportion to their length and valued strength. As implemented in Ucinet and described in its online help section [6], flow betweenness is formally computed as follows: “Let mjk be the amount of flow between vertex j and vertex k which must pass through i for any maximum flow. The flow betweenness of vertex i is the sum of all mjk where i, j and k are distinct and j < k. The flow betweenness is therefore a measure of the contribution of a vertex to all possible maximum flows.
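A minimal Python sketch of this definition follows, under stated assumptions; it is not Ucinet's implementation. It reads "flow which must pass through i" as the maximum flow between j and k in the full network minus the maximum flow once i is removed, treats each valued undirected link as a capacity in both directions, and uses a plain Edmonds-Karp maximum-flow routine. The function names (`undirected`, `max_flow`, `without`, `flow_betweenness`) are hypothetical. The normalized variant divides by the total maximum flow over all pairs in which i is neither source nor sink.

```python
from collections import deque
from itertools import combinations


def undirected(edges):
    """Build a symmetric capacity map from (u, v, weight) triples."""
    cap = {}
    for u, v, w in edges:
        cap.setdefault(u, {})[v] = w
        cap.setdefault(v, {})[u] = w
    return cap


def max_flow(capacity, source, sink):
    """Edmonds-Karp maximum flow on a dict-of-dicts capacity map."""
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    total = 0
    while True:
        # Breadth-first search for an augmenting path in the residual network.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        # Find the bottleneck capacity along the path, then push flow along it.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] = residual[v].get(u, 0) + bottleneck
            v = u
        total += bottleneck


def without(capacity, node):
    """Copy of the capacity map with `node` and its links removed."""
    return {u: {v: c for v, c in nbrs.items() if v != node}
            for u, nbrs in capacity.items() if u != node}


def flow_betweenness(capacity, normalized=True):
    """Flow betweenness of every vertex, raw or normalized (assumed reading)."""
    scores = {}
    for i in capacity:
        reduced = without(capacity, i)
        others = sorted(n for n in capacity if n != i)
        through, total = 0, 0
        for j, k in combinations(others, 2):
            full = max_flow(capacity, j, k)
            through += full - max_flow(reduced, j, k)  # flow that needs i
            total += full
        scores[i] = through / total if normalized and total else through
    return scores


if __name__ == "__main__":
    chain = undirected([("a", "b", 3), ("b", "c", 2)])
    print(flow_betweenness(chain, normalized=False))
```

In the three-node chain, all flow between the two end nodes has to pass through the middle node, so only the middle node receives a nonzero score. In a triangle with equal link values, each node carries half of the maximum flow between the other two.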
The normalized flow betweenness centrality of a vertex i is the flow betweenness of i divided by the total flow through all pairs of points where i is not a source or sink. For a network with vertices v1, ..., vn and maximum flow betweenness centrality cmax, the network flow betweenness centralization measure is Σ(cmax - c(vi)) divided by the maximum value possible, where c(vi) is the flow betweenness centrality of vertex vi.”

IV. RESULTS

Table 2 shows the flow betweenness values for each country in the network, based on co-occurrences of countries in the documents analyzed in each period, with the countries ranked within each period. The values shown are the normalized ones, so the values can be compared directly even though the number of countries present differs in each period.

TABLE 2. FLOW BETWEENNESS CENTRALITY BY TLD FOR 2010, EARLY 2011 UPRISINGS, AND POST BIN LADEN TAKEDOWN

Late 2010      Uprising       Post Bin Laden
TLD  Value     TLD  Value     TLD  Value
af   12.33     pk   9.28      ir   11.83
ir   9.90      ir   7.41      so   10.87
uz   7.86      ng   6.88      eg   8.83
sd   7.57      sd   6.28      af   8.68
pk   6.28      uz   6.25      uz   5.63
dz   6.16      af   5.37      pk   4.51
sa   5.00      eg   5.19      sa   4.06
ne   4.65      ml   4.84      sy   3.03
ye   4.26      dz   4.65      ye   2.99
ng   3.52      gn   4.65      ml   2.89
tr   3.06      tr   4.49      dz   2.86
tj   2.97      sa   3.80      iq   2.74
mr   2.65      tj   2.98      ma   2.72
iq   2.59      ye   2.90      bh   2.35
az   2.51      kz   2.61      sd   2.33
td   2.36      ne   2.59      tn   2.17
bf   2.30      so   2.46      id   2.07
kw   1.91      kg   2.45      ly   1.40
eg   1.85      ly   2.33      kw   0.94
sn   1.81      bf   2.14      tj   0.92
id   1.75      sl   2.13      tr   0.89
sy   1.68      id   2.06      jo   0.74
ma   1.68      gm   1.93      lb   0.52
ml   1.25      sn   1.86      qa   0.51
lb   1.20      bh   1.75      om   0.29
tn   1.16      iq   1.73      mr   0.24
ae   1.10      jo   1.69      bd   0.08
bh   0.91      mr   1.64      td   0.07
gn   0.82      td   1.47      ae   0.08
jo   0.71      lb   1.44      al   0.01
al   0.64      az   1.22      az   0.00
ly   0.62      ma   1.16      my   0.00
kg   0.61      sy   1.09      kg   0.00
so   0.52      tn   0.92      ng   0.00
kz   0.32      qa   0.71      ne   0.00
my   0.27      bd   0.68      dj   0.00
qa   0.27      kw   0.66      bf   0.00
tm   0.21      ae   0.46      gm   0.00
om   0.19      my   0.21      kz   0.00
bd   0.07      om   0.11      tm   0.00
gm   0.00      tm   0.02      mv   0.00
dj   0.00      dj   0.00      al   0.00
sl   0.00      mv   0.00      km   0.00

Figs. 2-4 show the networks among countries in the jihad documents. If a country appeared together with another within three country positions on either side within a sentence (recall that all but country words are dropped), the two are counted as linked. The graphics give an overall visualization of differences in country flow betweenness centrality, with the size of the country nodes scaled according to this centrality. Table 2 shows the numerical values for flow betweenness centrality for each country in each of the three periods. We note the following results.

Somalia makes the largest move, jumping from rank 35 and a normalized centrality score of .52 to rank 17 and a score of 2.46 from the baseline period to the early uprising period. That is a 4.7 times increase. It continues to move upward in the post-Bin Laden period to a rank of 2 and a score of 10.87, an additional 4.4 times increase.

Egypt makes a large move in centrality. It moves from rank 19 and a normalized value of 1.85 to rank 7 at a normalized value of 5.19, increasing by a factor of 2.8, then moves to rank 3 and a value of 8.83, a 78% increase.

Mali goes from rank 24 in the benchmark period to rank 8 in the early uprising period, centrality increasing by 3.9 times, then moves down to rank 10, dropping in centrality by 67%.

Libya begins at rank 32 in the benchmark period and moves to rank 19 in the early uprising period, a change in centrality of 3.8 times higher. Then it has a minor rank change to 18 in the post-Bin Laden period, with a 64% decline in centrality.

Bahrain moves from a benchmark rank of 28 to 24, with an increase in centrality of 92%, and continues upward to rank 14, gaining 34% in centrality.

Iran is consistently in the top two in each period, dropping 34% during the uprising period but later increasing 61% in the post-Bin Laden period.

Sudan was relatively high in periods 1 and 2, at rank 4 in each, then dropped to rank 15 in the post-Bin Laden period, dropping by 67%.

Pakistan moves up from rank 5 in the benchmark period to rank 1 in the uprising period, going from 6.28 to 9.28, a 48% increase, but drops to rank 6 in the post-Bin Laden period, a decrease by a factor of 2.

Nigeria moves up in centrality from rank 10 in the benchmark period to rank 3 in the uprising period, nearly doubling in centrality, but drops to zero in the post-Bin Laden period.

Kazakhstan makes a big move up from rank 36 in the benchmark period to rank 15 in the uprising period, centrality changing from .32 to 2.61, an 8.2 times increase, but it drops to rank 39 with a centrality score of 0 in the post-Bin Laden period.

Kyrgyzstan shows a similar pattern, going from rank 34 in the benchmark period to rank 18 in the uprising period, a four-fold move in centrality, but drops to rank 45 and a centrality of 0 in the post-Bin Laden period.

Syria moves down from rank 22 in the benchmark period to rank 32 in the early uprising period but jumps to rank 8 in the post-Bin Laden period, a 2.8 times increase. This may be associated with Syrian uprisings gaining momentum later than the early uprising period.

Guinea makes a big move up from rank 29 in the benchmark period to rank 10 in the uprising period, a 5.5 times increase in centrality, but disappears in the post-Bin Laden period.

The next type of SNA is of the words in the documents. Words within 3 word positions on either side of each word were counted in terms of cumulative frequencies within each period, then the node-centric network around the word ‘jihad’, moving two steps out from it, was extracted. These networks are shown in Figs. 5-7. Note that because there are more total words in the early uprising period, the jihad-centric network is larger.

Figure 2. Benchmark Period: Late 2010
Figure 3. Early 2011 Uprising Period MNs Network Flow Betweenness Centrality
Figure 4. Post Bin Laden Period MNs Network Flow Betweenness Centrality
Figure 5. Jihad-Centric Network in Late 2010 Period
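
The extraction of a jihad-centric network, two steps out from the seed word, can be sketched in Python as follows. This is an illustrative reading rather than the NodeTric program itself. It assumes unordered word pairs with frequencies, keeps a pair only when both of its words lie within two links of the seed, and applies the minimum frequency of 50 used for the word network figures. The function name `node_centric` and the sample counts are hypothetical.

```python
from collections import defaultdict


def node_centric(pair_counts, seed="jihad", steps=2, min_count=1):
    """Return the pairs whose two words both lie within `steps` links of `seed`,
    ignoring pairs that occur fewer than `min_count` times."""
    neighbors = defaultdict(set)
    for (a, b), n in pair_counts.items():
        if n >= min_count:
            neighbors[a].add(b)
            neighbors[b].add(a)

    # Expand outward from the seed one step at a time.
    reached = {seed}
    frontier = {seed}
    for _ in range(steps):
        frontier = {m for node in frontier for m in neighbors[node]} - reached
        reached |= frontier

    return {pair: n for pair, n in pair_counts.items()
            if n >= min_count and pair[0] in reached and pair[1] in reached}


if __name__ == "__main__":
    # Made-up pair frequencies, for illustration only.
    counts = {
        ("jihad", "holy"): 60,
        ("holy", "war"): 55,
        ("war", "civil"): 70,
        ("civil", "unrest"): 80,
        ("jihad", "rare"): 3,
    }
    print(node_centric(counts, min_count=50))
```

With the made-up counts above, "holy" is one step from "jihad" and "war" is two steps away, so only those two pairs survive. "civil" is three steps out, and the "rare" pair falls below the frequency threshold.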