UCINET Tutorial: A Detailed Guide to a Social Network Data Analysis Tool

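This tutorial covers UCINET, a specialized software package for social network analysis. It is part of an online textbook written by Robert A. Hanneman and Mark Riddle for the Department of Sociology at the University of California, Riverside. The tutorial focuses on research methods for social network data and stresses how this field differs from conventional social science methods.

Social network data are distinctive in that they concern the network of relations among individuals rather than the attributes or behavior of any single individual. Such data sets consist of nodes and relations, which together make up the network structure, and they raise questions of populations, samples, and boundaries. Nodes can represent persons, organizations, events, or other entities; relations represent the connections or interactions among them. The nodes studied may be an entire social group or a subset of individuals sampled for the study. Network analysis works at different levels and with different patterns, including collections of social units (populations), particular study samples, and whatever boundary conditions apply, all of which affect how the data are read and analyzed.

Relations in a social network are not limited to simple binary ties. Several types of relation can be present, such as one-way, two-way, hierarchical, or chain-like ties, which adds to the complexity of the data. Relations may also differ in modality, meaning that their strength and direction can vary, and this needs particular care during data processing. Scales of measurement are another key element: measures may be qualitative (for example social status or influence) or quantitative (for example frequency or intensity), and the choice matters for selecting analysis methods and interpreting results. Statistically, social network data call for special techniques because they differ from the independent, relatively homogeneous observations assumed in conventional analysis. Although social network data can look like other survey data on the surface, their emphasis is on structure and interaction rather than on individual attributes or variables.

With a tool such as UCINET, researchers can work with large data sets in formats including text files, KrackPlot, Pajek, Negopy, and VNA, and can analyze networks of up to 32,767 nodes. UCINET also integrates with Pajek, a free program suited to large networks, which makes visualization and in-depth analysis of complex networks possible. Overall, the tutorial offers a comprehensive guide to social network analysis, covering data collection, processing, analysis, and visualization, and is aimed at students and professionals in the social sciences.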

Introduction to Social Network Methods: Chapter 1: Social Network Data

In one way, there is little apparent difference between conventional statistical approaches and

network approaches. Univariate, bi-variate, and even many multivariate descriptive statistical

tools are commonly used in describing, exploring, and modeling social network data. Social

network data are, as we have pointed out, easily represented as arrays of numbers -- just like

other types of sociological data. As a result, the same kinds of operations can be performed on

network data as on other types of data. Algorithms from statistics are commonly used to

describe characteristics of individual observations (e.g. the median tie strength of actor X with

all other actors in the network) and the network as a whole (e.g. the mean of all tie strengths

among all actors in the network). Statistical algorithms are very heavily used in assessing the

degree of similarity among actors, and in finding patterns in network data (e.g. factor analysis,

cluster analysis, multi-dimensional scaling). Even the tools of predictive modeling are

commonly applied to network data (e.g. correlation and regression).
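
As a rough illustration of the descriptive summaries just mentioned, here is a minimal Python sketch. The four actors and their tie strengths are made-up values, and the function names are hypothetical; nothing here comes from UCINET itself. It computes the median tie strength of one actor with all others, and the mean of all tie strengths in the network.

```python
from statistics import mean, median

# Hypothetical valued, directed network: ties[i][j] is the strength of the
# tie from actor i to actor j (0 means no tie). The diagonal is ignored.
actors = ["A", "B", "C", "D"]
ties = [
    [0, 3, 1, 0],
    [2, 0, 4, 1],
    [0, 5, 0, 2],
    [1, 0, 3, 0],
]


def off_diagonal_row(matrix, i):
    """Tie strengths sent by actor i to every other actor."""
    return [value for j, value in enumerate(matrix[i]) if j != i]


def median_tie_strength(matrix, i):
    """Median strength of actor i's ties to all other actors."""
    return median(off_diagonal_row(matrix, i))


def mean_tie_strength(matrix):
    """Mean of all off-diagonal tie strengths in the network."""
    values = [v for i in range(len(matrix)) for v in off_diagonal_row(matrix, i)]
    return mean(values)


if __name__ == "__main__":
    for i, name in enumerate(actors):
        print(name, median_tie_strength(ties, i))
    print("network mean:", mean_tie_strength(ties))
```

The point is only that a network stored as an array of numbers can be summarized with the same operations used on any other array of scores.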

Descriptive statistical tools are really just algorithms for summarizing characteristics of the

distributions of scores. That is, they are mathematical operations. Where statistics really

become "statistical" is on the inferential side. That is, when our attention turns to assessing the

reproducibility or likelihood of the pattern that we have described. Inferential statistics can be,

and are, applied to the analysis of network data. But, there are some quite important

differences between the flavors of inferential statistics used with network data, and those that

are most commonly taught in basic courses in statistical analysis in sociology.

Probably the most common emphasis in the application of inferential statistics to social science

data is to answer questions about the stability, reproducibility, or generalizability of results

observed in a single sample. The main question is: if I repeated the study on a different sample

(drawn by the same method), how likely is it that I would get the same answer about what is

going on in the whole population from which I drew both samples? This is a really important

question -- because it helps us to assess the confidence (or lack of it) that we ought to have in

assessing our theories and giving advice.

To the extent the observations used in a network analysis are drawn by probability sampling

methods from some identifiable population of actors and/or ties, the same kind of question

about the generalizability of sample results applies. Often this type of inferential question is of

little interest to social network researchers. In many cases, they are studying a particular

network or set of networks, and have no interest in generalizing to a larger population of such

networks (either because there isn't any such population, or we don't care about generalizing

to it in any probabilistic way). In some other cases we may have an interest in generalizing, but

our sample was not drawn by probability methods. Network analysis often relies on artifacts,

direct observation, laboratory experiments, and documents as data sources -- and usually

there are no plausible ways of identifying populations and drawing samples by probability

methods.

The other major use of inferential statistics in the social sciences is for testing hypotheses. In

file:///C|/Documents%20and%20Settings/hanneman/My%2...s/Network_Text/Version2/C1_Social_Network_Data.html (16 of 18)3/17/2005 11:28:45 AM

many cases, the same or closely related tools are used for questions of assessing

generalizability and for hypothesis testing. The basic logic of hypothesis testing is to compare

an observed result in a sample to some null hypothesis value, relative to the sampling

variability of the result under the assumption that the null hypothesis is true. If the sample

result differs greatly from what was likely to have been observed under the assumption that the

null hypothesis is true -- then the null hypothesis is probably not true.
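
One common way to write this comparison is as a standardized test statistic. The symbols are assumptions introduced here for illustration: \hat{\theta} is the result observed in the sample, \theta_0 is the null hypothesis value, and SE(\hat{\theta}) is the sampling variability (standard error) of the result when the null hypothesis is true.

```latex
z = \frac{\hat{\theta} - \theta_0}{\mathrm{SE}(\hat{\theta})}
```

A value of z far from zero says the sample result is many standard errors away from the null value, which is what "differs greatly" means in practice.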

The key link in the inferential chain of hypothesis testing is the estimation of the standard

errors of statistics. That is, estimating the expected amount that the value of a statistic would

"jump around" from one sample to the next simply as a result of accidents of sampling. We

rarely, of course, can directly observe or calculate such standard errors -- because we don't

have replications. Instead, information from our sample is used to estimate the sampling

variability.

With many common statistical procedures, it is possible to estimate standard errors by well

validated approximations (e.g. the standard error of a mean is usually estimated by the sample

standard deviation divided by the square root of the sample size). These approximations,

however, hold when the observations are drawn by independent random sampling. Network

observations are almost always non-independent, by definition. Consequently, conventional

inferential formulas do not apply to network data (though formulas developed for other types of

dependent sampling may apply). It is particularly dangerous to assume that such formulas do

apply, because the non-independence of network observations will usually result in

underestimates of true sampling variability -- and hence, too much confidence in our results.
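
For reference, the conventional approximation mentioned above (sample standard deviation divided by the square root of the sample size) can be sketched in a few lines of Python. The function name and the sample values are hypothetical. Note that the calculation is only justified under independent random sampling, which is exactly the assumption that network observations usually violate.

```python
from math import sqrt
from statistics import stdev


def standard_error_of_mean(sample):
    """Estimate the standard error of the mean as s / sqrt(n).

    Assumes the observations were drawn by independent random sampling;
    for network data this will usually understate the true variability.
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    return stdev(sample) / sqrt(n)


if __name__ == "__main__":
    scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up scores for eight cases
    print(standard_error_of_mean(scores))
```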

The approach of most network analysts interested in statistical inference for testing

hypotheses about network properties is to work out the probability distributions for statistics

directly. This approach is used because: 1) no one has developed approximations for the

sampling distributions of most of the descriptive statistics used by network analysts and 2)

interest often focuses on the probability of a parameter relative to some theoretical baseline

(usually randomness) rather than on the probability that a given network is typical of the

population of all networks.

Suppose, for example, that I was interested in the proportion of the actors in a network who

were members of cliques (or any other network statistic or parameter). The notion of a clique

implies structure -- non-random connections among actors. I have data on a network of ten

nodes, in which there are 20 symmetric ties among actors, and I observe that there is one

clique containing four actors. The inferential question might be posed as: how likely is it, if ties

among actors were purely random events, that a network composed of ten nodes and 20

symmetric ties would display one or more cliques of size four or more? If it turns out that

cliques of size four or more in random networks of this size and degree are quite common, I

should be very cautious in concluding that I have discovered "structure" or non-randomness. If

it turns out that such cliques (or more numerous or more inclusive ones) are very unlikely

under the assumption that ties are purely random, then it is very plausible to reach the

conclusion that there is a social structure present.

But how can I determine this probability? The method used is one of simulation -- and, like

most simulation, a lot of computer resources and some programming skills are often

necessary. In the current case, I might use a table of random numbers to distribute 20 ties

among 10 actors, and then search the resulting network for cliques of size four or more. If no

clique is found, I record a zero for the trial; if a clique is found, I record a one. The rest is

simple. Just repeat the experiment several thousand times and add up what proportion of the

"trials" result in "successes." The probability of a success across these simulation experiments

is a good estimator of the likelihood that I might find a network of this size and density to have

a clique of this size "just by accident" when the non-random causal mechanisms that I think

cause cliques are not, in fact, operating.
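
The simulation just described can be sketched in Python, with a pseudo-random number generator standing in for the table of random numbers. This is a minimal sketch under stated assumptions, not a UCINET procedure: the function names are hypothetical, ties are placed uniformly at random among all possible pairs, and a "clique of size four or more" is detected by looking for any four actors who are all tied to one another.

```python
import random
from itertools import combinations


def random_network(n_nodes, n_ties, rng):
    """Place n_ties distinct symmetric ties uniformly at random among n_nodes."""
    possible = list(combinations(range(n_nodes), 2))
    return set(rng.sample(possible, n_ties))


def has_clique(ties, n_nodes, size):
    """True if some set of `size` nodes is fully connected.

    Any larger clique contains a fully connected set of exactly `size`
    nodes, so this also answers "size or more".
    """
    for group in combinations(range(n_nodes), size):
        if all(pair in ties for pair in combinations(group, 2)):
            return True
    return False


def clique_probability(n_nodes=10, n_ties=20, size=4, trials=5000, seed=0):
    """Share of random networks that contain at least one clique of `size`."""
    rng = random.Random(seed)
    successes = 0
    for _ in range(trials):
        ties = random_network(n_nodes, n_ties, rng)
        if has_clique(ties, n_nodes, size):
            successes += 1  # record a one for this trial
    return successes / trials


if __name__ == "__main__":
    print(clique_probability())
```

The exhaustive search over all groups of four is affordable only because the network is tiny; for larger networks a dedicated clique-finding routine would be needed.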

This may sound odd, and it is certainly a lot of work (most of which, thankfully, can be done by

computers). But, in fact, it is not really different from the logic of testing hypotheses with non-

network data. Social network data tend to differ from more "conventional" survey data in some

key ways: network data are often not probability samples, and the observations of individual

nodes are not independent. These differences are quite consequential for both the questions

of generalization of findings, and for the mechanics of hypothesis testing. There is, however,

nothing fundamentally different about the logic of the use of descriptive and inferential statistics

with social network data.

The application of statistics to social network data is an interesting area, and one that is, at the

time of this writing, at a "cutting edge" of research in the area. Since this text focuses on more

basic and commonplace uses of network analysis, we won't have very much more to say about

statistics beyond this point. You can think of much of what follows here as dealing with the

"descriptive" side of statistics (developing index numbers to describe certain aspects of the

distribution of relational ties among actors in networks). For those with an interest in the

inferential side, a good place to start is with the second half of the excellent Wasserman and

Faust textbook.

Social Network Analysis Primer: Why Formal Methods?

represent the descriptions of networks compactly and systematically. This also enables us to

use computers to store and manipulate the information quickly and more accurately than we

can by hand. For small populations of actors (e.g. the people in a neighborhood, or the

business firms in an industry), we can describe the pattern of social relationships that connect

the actors rather completely and effectively using words. To make sure that our description is

complete, however, we might want to list all logically possible pairs of actors, and describe

each kind of possible relationship for each pair. This can get pretty tedious if the number of

actors and/or number of kinds of relations is large. Formal representations ensure that all the

necessary information is systematically represented, and provide rules for doing so in ways

that are much more efficient than lists.
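
To see how quickly a complete list grows, here is a small Python sketch. It assumes that relationships are directed, so that each ordered pair of actors needs its own description; the function names and the example numbers are hypothetical.

```python
from itertools import permutations


def ordered_pairs(actors):
    """Every logically possible (sender, receiver) pair of distinct actors."""
    return list(permutations(actors, 2))


def descriptions_needed(n_actors, n_relations):
    """Number of pair-by-relation entries a complete listing would contain."""
    return n_actors * (n_actors - 1) * n_relations


if __name__ == "__main__":
    print(ordered_pairs(["Firm1", "Firm2", "Firm3"]))
    # 30 firms and 3 kinds of relation already require thousands of entries.
    print(descriptions_needed(30, 3))
```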

Using computers

A related reason for using (particularly mathematical) formal methods for representing social

networks is that mathematical representations allow us to apply computers to the analysis of

network data. Why this is important will become clearer as we learn more about how structural

analysis of social networks occurs. Suppose, for a simple example, we had information about

trade-flows of 50 different commodities (e.g. coffee, sugar, tea, copper, bauxite) among the

170 or so nations of the world system in a given year. Here, the 170 nations can be thought of

as actors or nodes, and the amount of each commodity exported from each nation to each of

the other 169 can be thought of as the strength of a directed tie from the focal nation to the

other. A social scientist might be interested in whether the "structures" of trade in mineral

products are more similar to one another than the structures of trade in mineral products are to

those in vegetable products. To answer this fairly simple (but also pretty important) question, a huge

amount of manipulation of the data is necessary. It could take, literally, years to do by hand; it

can be done by a computer in a few minutes.
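
One possible way to make "similarity of structures" concrete is to correlate the cells of two trade matrices. The Python sketch below does this for made-up export volumes among four hypothetical nations; the choice of Pearson correlation as the similarity measure, the function names, and all the numbers are assumptions for illustration only.

```python
from math import sqrt


def off_diagonal(matrix):
    """Flatten a square matrix, skipping the diagonal (no self-trade)."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(n) if i != j]


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def structural_similarity(a, b):
    """Correlation between corresponding directed ties of two networks."""
    return pearson(off_diagonal(a), off_diagonal(b))


# Hypothetical exports: cell [i][j] is the amount nation i ships to nation j.
copper = [[0, 5, 1, 0], [2, 0, 0, 1], [0, 3, 0, 4], [1, 0, 2, 0]]
bauxite = [[0, 10, 2, 0], [4, 0, 0, 2], [0, 6, 0, 8], [2, 0, 4, 0]]
coffee = [[0, 0, 3, 4], [0, 0, 5, 0], [2, 0, 0, 0], [0, 6, 0, 0]]

if __name__ == "__main__":
    print("copper vs bauxite:", structural_similarity(copper, bauxite))
    print("copper vs coffee:", structural_similarity(copper, coffee))
```

With 50 commodities there are 1,225 such pairs of matrices to compare, each with 170 x 169 cells, which is why the job is hopeless by hand and trivial for a computer.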

Seeing patterns

The third and final reason for using "formal" methods (mathematics and graphs) for

representing social network data is that the techniques of graphing and the rules of

mathematics themselves suggest things that we might look for in our data -- things that might

not have occurred to us if we presented our data using descriptions in words. Again, allow me

a simple example.

Suppose we were describing the structure of close friendship in a group of four people: Bob,

Carol, Ted, and Alice. This is easy enough to do with words. Suppose that Bob likes Carol and

Ted, but not Alice; Carol likes Ted, but neither Bob nor Alice; Ted likes all three of the other

members of the group; and Alice likes only Ted (this description should probably strike you as

being a description of a very unusual social structure).
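
The same description can be written as a matrix of zeros and ones. The Python sketch below is one hedged way to do it; the variable and function names are hypothetical. Each row is the person doing the liking, each column is the person being liked, and a few sums over the matrix immediately point to things the verbal description does not make obvious.

```python
people = ["Bob", "Carol", "Ted", "Alice"]

# Who likes whom, taken directly from the verbal description.
likes = {
    "Bob": {"Carol", "Ted"},
    "Carol": {"Ted"},
    "Ted": {"Bob", "Carol", "Alice"},
    "Alice": {"Ted"},
}


def adjacency_matrix(people, likes):
    """Rows are senders, columns are receivers; 1 means 'row likes column'."""
    return [[1 if target in likes[source] else 0 for target in people]
            for source in people]


matrix = adjacency_matrix(people, likes)

# Number of choices each person makes (row sums) and receives (column sums).
out_degree = {p: sum(row) for p, row in zip(people, matrix)}
in_degree = {p: sum(row[j] for row in matrix) for j, p in enumerate(people)}

# Pairs in which the liking runs in both directions.
reciprocated = [
    (people[i], people[j])
    for i in range(len(people))
    for j in range(i + 1, len(people))
    if matrix[i][j] and matrix[j][i]
]

if __name__ == "__main__":
    for name, row in zip(people, matrix):
        print(f"{name:6}", row)
    print("chosen by:", in_degree)
    print("reciprocated:", reciprocated)
```

Laid out this way, it is easy to see that every reciprocated tie involves Ted, and that Bob's liking for Carol is not returned.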

file:///C|/Documents%20and%20Settings/RHanneman/My%2...Network_Text/pdf/net_text_pdf/C2_Formal_Methods.html (2 of 4)7/31/2008 6:11:34 AM
