Research on plagiarism detection techniques has attracted considerable
interest from researchers in Vietnam and abroad, and this thesis therefore
pursues a line of research on this class of problems. Over the course of the
research, we found that existing proposals for plagiarism detection still have
limitations: approaches to cases where the copied text has been modified are
not yet truly effective, and the application of plagiarism detection techniques
to Vietnamese text remains limited. The research direction of the thesis is
therefore necessary. The thesis has achieved its goals: it proposes techniques
for the global plagiarism detection problem, builds Vietnamese corpora, and
improves the proposed techniques and evaluates them on these corpora, helping
to overcome the limitations noted above.
The results achieved by the thesis are:
- A study of the global plagiarism detection problem; an analysis and
assessment of the strengths and weaknesses of the research directions related
to its two component problems: keyword extraction for retrieving the set of
candidate documents, and detection of plagiarized passages.
- A proposed keyword extraction method for retrieving candidate documents and
two methods for detecting plagiarized passages in English text, with
experiments comparing and evaluating the proposed methods against
international approaches to each problem.
- A proposed keyword extraction method for long Vietnamese documents, and
improvements to the techniques proposed for English text so that they can be
applied to Vietnamese text.
- A proposed solution and procedure for building a Vietnamese corpus for
plagiarized-passage detection, used to test and evaluate plagiarism detection
algorithms on Vietnamese text.
- Two Vietnamese corpora collected and built, an article corpus and a ĐATN
(graduation project) corpus, used for the Vietnamese keyword extraction problem
Excerpt from the thesis "Nghiên cứu phát triển một số kỹ thuật hỗ trợ phát hiện đạo văn và ứng dụng cho văn bản Tiếng Việt" (Research and development of techniques supporting plagiarism detection and their application to Vietnamese text).
e time.
Our approach to addressing the networking requirements for live
WAN migration builds on the observations that not all networking
changes in this approach are time critical and further that
instantaneous changes are best achieved in a localized manner.
Specifically, in our solution, described in detail in Section 3, we allow the
migration software to initiate the necessary networking changes as
soon as the need for migration has been identified. We make use
of tunneling technologies during this initial phase to preemptively
establish connectivity between the data centers involved. Once
server migration is complete, the migration software initiates a
local change to direct traffic towards the new data center via the
tunnel. Slower time scale network changes then phase out this local
network connectivity change for a more optimal network wide path
to the new data center.
2.3 Storage Replication Requirements
Data availability is typically addressed by replicating business
data on a local/primary storage system, to some remote location
from where it can be accessed. From a business/usability point of
view, such remote replication is driven by two metrics [9]. First
is the recovery-point-objective which is the consistent data point to
which data can be restored after a disaster. Second is the
recovery-time-objective which is the time it takes to recover to that
consistent data point after a disaster [13].
Remote replication can be broadly classified into the following
two categories:
- Synchronous replication: every data block written to a local
storage system is replicated to the remote location before the
local write operation returns.
- Asynchronous replication: in this case the local and remote
storage systems are allowed to diverge. The amount of
divergence between the local and remote copies is typically
bounded by either a certain amount of data, or by a certain
amount of time.
Synchronous replication is normally recommended for
applications, such as financial databases, where consistency between local
and remote storage systems is a high priority. However, these
desirable properties come at a price. First, because every data block
needs to be replicated remotely, synchronous replication systems
cannot benefit from any local write coalescing of data if the same
data blocks are written repeatedly [16]. Second, because data have
to be copied to the remote location before the write operation
returns, synchronous replication has a direct performance impact on
the application, since both lower throughput and increased latency
of the path between the primary and the remote systems are
reflected in the time it takes for the local disk write to complete.
An alternative is to use asynchronous replication. However,
because the local and remote systems are allowed to diverge,
asynchronous replication always involves some data loss in the event
of a failure of the primary system. But, because write operations
can be batched and pipelined, asynchronous replication systems
can move data across the network in a much more efficient
manner than synchronous replication systems.
For WAN live server migration we seek a more flexible
replication system where the mode can be dictated by the migration
semantics. Specifically, to support live server migration we propose a
remote replication system where the initial transfer of data between
the data centers is performed via asynchronous replication to
benefit from the efficiency of that mode of operation. When the bulk of
the data have been transferred in this manner, replication switches
to synchronous replication in anticipation of the completion of the
server migration step. The final server migration step triggers a
simultaneous switch-over to the storage system at the new data
center. In this manner, when the virtual server starts executing in the
new data center, storage requirements can be locally met.
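The mode switch described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class `HybridReplicator`, its method names, and the in-memory lists standing in for the local and remote storage systems are all hypothetical.

```python
from collections import deque


class HybridReplicator:
    """Toy replicator that starts asynchronous and can switch to synchronous.

    All names are hypothetical; plain lists stand in for the local and
    remote storage systems.
    """

    def __init__(self):
        self.local = []          # blocks written at the primary site
        self.remote = []         # blocks that have reached the remote site
        self.pending = deque()   # blocks written locally but not yet replicated
        self.synchronous = False

    def write(self, block):
        self.local.append(block)
        if self.synchronous:
            # Synchronous mode: the write returns only after the remote copy exists.
            self.remote.append(block)
        else:
            # Asynchronous mode: queue the block and return immediately.
            self.pending.append(block)

    def flush(self):
        """Ship all queued blocks to the remote site in one batch."""
        while self.pending:
            self.remote.append(self.pending.popleft())

    def switch_to_synchronous(self):
        """Drain pending asynchronous writes, then replicate synchronously."""
        self.flush()
        self.synchronous = True


replicator = HybridReplicator()
replicator.write("b1")
replicator.write("b2")
# While asynchronous, the local and remote copies are allowed to diverge.
divergence_before = len(replicator.local) - len(replicator.remote)
replicator.switch_to_synchronous()
replicator.write("b3")
in_sync = replicator.local == replicator.remote
```

Draining the queue before flipping the mode flag is what lets the final switch-over find both copies identical; a real system would also have to bound the queue and handle writes that arrive during the drain.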
3. WAN MIGRATION SCENARIOS
In this section we illustrate how our cooperative, context aware
approach can combine the technical building blocks described in
the previous section to realize live server migration across a wide
area network. We demonstrate how the coordination of server
virtualization and migration technologies, the storage replication
subsystem and the network can achieve live migration of the entire data
center across the WAN. We utilize different scenarios to
demonstrate our approach. In Section 3.1 we outline how our approach
can be used to achieve the safe live migration of a data center when
planned maintenance events are handled. In Section 3.2 we show
the use of live server migration to mitigate the effects of unplanned
outages or failures.
3.1 Maintenance Outages
We deal with maintenance outages in two parts. First, we
consider the case where the service has no (or very limited) storage
requirements. This might for example be the case with a network
element such as a voice-over-IP (VoIP) gateway. Second, we deal
with the more general case where the service also requires the
migration of data storage to the new data center.
Without Requiring Storage to be Migrated: Without storage to
be replicated, the primary components that we need to coordinate
are the server migration and network mobility. Figure 1 shows
the environment where the application running in a virtual server
VS has to be moved from a physical server in data center A to a
physical server in data center B.
Prior to the maintenance event, the coordinating migration
management system (MMS) would signal to both the server
management system as well as the network that a migration is imminent.
The server management system would initiate the migration of the
virtual server from physical server a (PS_a) to physical server b
(PS_b). After an initial bulk state transfer as preparation for
migration, the server management system will mirror any state changes
between the two virtual servers.
Similarly, for the network part, based on the signal received
from the MMS, the service provider edge (PE) router will
initiate a number of steps to prepare for the migration. Specifically,
as shown in Figure 1(b), the migration system will cause the
network to create a tunnel between PE_a and PE_b which will be used
subsequently to transfer data destined to VS to data center B.
When the MMS determines a convenient point to quiesce the VS,
another signal is sent to both the server management system and
the network. For the server management system, this signal will
indicate the final migration of the VS from data center A to data
center B, i.e., after this the VS will become active in data center B.
For the network, this second signal enables the network data path to
switchover locally at PE_a to the remote data center. Specifically,
from this point in time, any traffic destined for the virtual server
address that arrives at PE_a will be switched onto the tunnel to
PE_b for delivery to data center B.
Note that at this point, from a server perspective the migration is
complete as the VS is now active in data center B. However, traffic
is sub-optimally flowing first to PE_a and then across the tunnel
to PE_b. To rectify this situation another networking step is
involved. Specifically, PE_b starts to advertise a more preferred route
to reach VS than the route currently being advertised by PE_a. In
this manner, as ingress PEs to the network (see Figure 1) receive
the more preferred route, traffic will start to flow to
PE_b directly and the tunnel between PE_a and PE_b can be torn
down leading to the final state shown in Figure 1(c).
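The three traffic paths described above can be summarized in a small sketch. It is illustrative only: the function `path_to_vs` and the phase names "before", "tunnel" and "final" are hypothetical labels, and PE_a and PE_b denote the provider edge routers serving data centers A and B.

```python
def path_to_vs(phase):
    """Return the hops that traffic for the virtual server (VS) takes in each phase.

    The phase names are hypothetical labels for the stages described in the text.
    """
    if phase == "before":
        # VS is still active in data center A.
        return ["ingress PE", "PE_a", "data center A"]
    if phase == "tunnel":
        # After the local switchover at PE_a: correct but sub-optimal path.
        return ["ingress PE", "PE_a", "tunnel", "PE_b", "data center B"]
    if phase == "final":
        # PE_b advertises the preferred route and the tunnel is torn down.
        return ["ingress PE", "PE_b", "data center B"]
    raise ValueError(f"unknown phase: {phase}")


detour_hops = len(path_to_vs("tunnel")) - len(path_to_vs("final"))
```

Only the switch from "before" to "tunnel" is time critical, since it is a local change at one router; the move from "tunnel" to "final" depends on route propagation and can happen on a slower time scale.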
Requiring Storage Migration: When storage has to also be
replicated, it is critical that we achieve the right balance between
performance (impact on the application) and the recovery point or
data loss when the switchover occurs to the remote data center. To
achieve this, we allow the storage to be replicated asynchronously,
prior to any initiation of the maintenance event, or, assuming the
amount of data to be transferred is relatively small, asynchronous
replication can be started in anticipation of a migration that is
expected to happen shortly. Asynchronous replication during this
initial phase allows for the application to see no performance
impact. However, when the maintenance event is imminent, the MMS
would signal to the replication system to switch from asynchronous
replication to synchronous replication to ensure that there is no
loss of data during migration. When data is being replicated
synchronously, there will be a performance impact on the application.
This requires us to keep the exposure to the amount of time we
replicate on a synchronous basis to a minimum.
Figure 1: Live server migration across a WAN
When the MMS signals to the storage system the requirement
to switch to synchronous replication, the storage system completes
all the pending asynchronous operations and then proceeds to
perform all the subsequent writes by synchronously replicating them to
the remote data center. Thus, between the server migration and
synchronous replication, both the application state and all the
storage operations are mirrored at the two environments in the two data
centers. When all the pending write operations are copied over,
then as in the previous case, we quiesce the application and the
network is signaled to switch traffic over to the remote data center.
From this point on, both storage and server migration operations
are complete and activated in data center B. As above, the network
state still needs to be updated to ensure optimal data flow directly
to data center B.
Note that while we have described the live server migration
process as involving the service provider for the networking part, it
is possible for a data center provider to perform a similar set of
functions without involving the service provider. Specifically, by
creating a tunnel between the customer edge (CE) routers in the
data center, and performing local switching on the appropriate CE,
rather than on the PE, the data center provider can realize the same
functionality.
3.2 Unplanned Outages
We propose to also use cooperative, context aware migration to
deal with unplanned data center outages. There are multiple
considerations that go into managing data center operations to plan
and overcome failures through migration. Some of these are: (1)
amount of overhead under normal operation to overcome
anticipated failures; (2) amount of data loss affordable (recovery point
objective - RPO); (3) amount of state that has to be migrated; and
(4) time available from anticipated failure to occurrence of event.
At the one extreme, one might incur the overhead of completely
mirroring the application at the remote site. This has the
consequence of both incurring processing and network overhead under
normal operation as well as impacting application performance
(latency and throughput) throughout. The other extreme is to only
ensure data recovery and to start a new copy of the application at the
remote site after an outage. In this case, application memory state
such as ongoing sessions are lost, but data stored on disk is
replicated and available in a consistent state. Neither the hot standby
nor the cold standby approach described is desirable, due to the
overhead or the loss of application memory state.
An intermediate approach is to recover control and essential state
of the application, in addition to data stored on disk, to further
minimize disruptions to users. A spectrum of approaches is possible.
In a VoIP server, for instance, session-based information can be
mirrored without mirroring the data flowing through each session.
More generally, this points to the need to checkpoint some
application state in addition to mirroring data on disk. Checkpointing
application state involves storing application state either periodically
or in an application-aware manner like databases do and then
copying it to the remote site. Of course, this has the consequence that
the application can be restarted remotely at the checkpoint
boundary only. Similarly, for storage one may use asynchronous
replication with a periodic snapshot ensuring all writes are up-to-date
at the remote site at the time of checkpointing. Some data loss
may occur upon an unanticipated, catastrophic failure, but the
recovery point may be fairly small, depending on the frequency of
checkpointing application and storage state. Coordination between
the checkpointing of the application state and the snapshot of
storage is key to successful migration while meeting the desired RPOs.
Incremental checkpointing of application and storage is key to
efficiency, and we see existing techniques to achieve this [4, 3, 11].
For instance, rather than full application mirroring, a virtualized
replica can be maintained as a warm standby, in a dormant or
hibernating state, enabling a quick switch-over to the previously
checkpointed state. To make the switch-over seamless, in addition
to replicating data and recovering state, network support is needed.
Specifically, on detecting the unavailability of the primary site, the
secondary site is made active, and the same mechanism described
in Section 3.1 is used to switch traffic over to reach the secondary
site via the pre-established tunnel. Note that for simplicity of
exposition we assume here that the PE that performs the local switch
over is not affected by the failure. The approach can, however,
easily be extended to make use of a switchover at a router deeper in
the network.
The amount of state and storage that has to be migrated may vary
widely from application to application. There may be many
situations where, in principle, the server can be stateless. For example,
a SIP proxy server may not have any persistent state and the
communication between the clients and the proxy server may be using
UDP. In such a case, the primary activity to be performed is in
the network to move the communication over to the new data
center site. Little or no overhead is incurred under normal operation to
enable the migration to a new data center. Failure recovery involves
no data loss and we can deal with near instantaneous, catastrophic
failures.
As more and more state is involved with the server, more
overhead is incurred to checkpoint application state and potentially
to take storage snapshots, either periodically or upon application
prompting. It also means that the RPO is a function of the
interval between checkpoints, when we have to deal with instantaneous
failures. The more advanced information we have of an impending
failure, the more effective we can be in having the state migrated
over to the new data center, so that we can still have a tighter RPO
when operations are resumed at the new site.
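The dependence of the recovery point on the checkpoint interval can be illustrated with a short calculation. This is a simplified sketch: the function `loss_window` is hypothetical, and it assumes that checkpoints complete instantly at times 0, interval, 2 x interval and so on, whereas real checkpoints take time to reach the remote site.

```python
def loss_window(failure_time, checkpoint_interval):
    """Return the span of work lost if the site fails at failure_time.

    Assumes checkpoints complete instantly at t = 0, interval, 2*interval, ...
    (a simplification). Both arguments use the same time unit.
    """
    if checkpoint_interval <= 0:
        raise ValueError("checkpoint_interval must be positive")
    last_checkpoint = (failure_time // checkpoint_interval) * checkpoint_interval
    return failure_time - last_checkpoint


# With a checkpoint every 60 seconds, a failure at t = 125 s loses 5 s of work.
lost_mid_interval = loss_window(125, 60)
# A failure just before the next checkpoint loses almost a full interval.
lost_near_worst_case = loss_window(119, 60)
```

Under this model the loss is always smaller than one interval, which is why shortening the interval tightens the RPO at the price of more checkpointing overhead during normal operation.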
4. RELATED WORK
Prior work on this topic falls into several categories: virtual
machine migration, storage replication and network support.
At the core of our technique is the ability to encapsulate
applications within virtual machines that can be migrated without
application downtimes [15]. Most virtual machine software, such as Xen
[8] and VMWare [14] support live migration of VMs that involve
extremely short downtimes ranging from tens of milliseconds to a
second; details of Xen's live migration techniques are discussed in
[8]. As indicated earlier, these techniques assume that migration is
being done on a LAN. VM migration has also been studied in the
Shirako system [10] and for grid environments [17, 19].
Current virtual machine software supports a suspend and resume
feature that can be used to support WAN migration, but with
downtimes [18, 12]. Recently live WAN migration using IP tunnels was
demonstrated in [21], where an IP tunnel is set up from the source
to destination server to transparently forward packets to and from
the application; we advocate an alternate approach that assumes
edge router support.
In the context of storage, there exist numerous commercial
products that perform replication, such as IBM Extended Remote Copy,
HP Continuous Access XP, and EMC RepliStor. An excellent
description of these and others, as well as a detailed taxonomy of the
different approaches for replication can be found in [11]. The Ursa
Minor system argues that no single fault model is optimal for all
applications and proposed supporting data-type specific selections
of fault models and encoding schemes for replication [1]. Recently,
we proposed the notion of semantic-aware replication [13] where
the system supports both synchronous and asynchronous
replication concurrently and uses signals from the file system to
determine whether to replicate a particular write synchronously or
asynchronously.
In the context of network support, our work is related to the
RouterFarm approach [2], which makes use of orchestrated
network changes to realize near hitless maintenance on provider edge
routers. In addition to being in a different application area, our
approach differs from the RouterFarm work in two regards. First,
we propose to have the required network changes be triggered by
functionality outside of the network (as opposed to network
management functions inside the network). Second, due to the stringent
timing requirements of live migration, we expect that our approach
would require new router functionality (as opposed to being
realizable via the existing configuration interfaces).
Finally, the recovery oriented computing (ROC) work
emphasizes recovery from failures rather than failure avoidance [6]. In a
similar spirit to ROC, we advocate using mechanisms from live VM
migration to storage replication to support planned and unplanned
outages in data centers (rather than full replication to mask such
failures).
5. CONCLUSION
A significant concern for Internet-based service providers is the
continued operation and availability of services in the face of
outages, whether planned or unplanned. In this paper we advocated
a cooperative, context-aware approach to data center migration
across WANs to deal with outages in a non-disruptive manner. We
sought to achieve high availability of data center services in the
face of both planned and incidental outages of data center
facilities. We advocated using server virtualization technologies to
enable the replication and migration of server functions. We proposed
new network functions to enable server migration and replication
across wide area networks (such as the Internet or a geographically
distributed virtual private network), and finally showed the utility
of intelligent and dynamic storage replication technology to ensure
applications have access to data in the face of outages with very
tight recovery point objectives.
6. REFERENCES
[1] M. Abd-El-Malek, W. V. Courtright II, C. Cranor, G. R.
Ganger, J. Hendricks, A. J. Klosterman, M. Mesnier,
M. Prasad, B. Salmon, R. R. Sambasivan, S. Sinnamohideen,
J. D. Strunk, E. Thereska, M. Wachs, and J. J. Wylie. Ursa
minor: versatile cluster-based storage. USENIX Conference
on File and Storage Technologies, December 2005.
[2] Mukesh Agrawal, Susan Bailey, Albert Greenberg, Jorge
Pastor, Panagiotis Sebos, Srinivasan Seshan, Kobus van der
Merwe, and Jennifer Yates. Routerfarm: Towards a dynamic,
manageable network edge. SIGCOMM Workshop on
Internet Network Management (INM), September 2006.
[3] L. Alvisi. Understanding the Message Logging Paradigm for
Masking Process Crashes. PhD thesis, Cornell, January
1996.
[4] L. Alvisi and K. Marzullo. Message logging: Pessimistic,
optimistic, and causal. In Proceedings of the 15th
International Conference on Distributed Computing Systems,
pages 229-236. IEEE Computer Society, June 1995.
[5] Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Tim
Harris, Alex Ho, Rolf Neugebar, Ian Pratt, and Andrew
Warfield. Xen and the art of virtualization. In the
Proceedings of the ACM Symposium on Operating Systems
Principles (SOSP), October 2003.
[6] A. Brown and D. A. Patterson. Embracing failure: A case for
recovery-oriented computing (roc). 2001 High Performance
Transaction Processing Symposium, October 2001.
[7] K. Brown, J. Katcher, R. Walters, and A. Watson. Snapmirror
and snaprestore: Advances in snapshot technology. Network
Appliance Technical Report TR3043.
www.netapp.com/tech_library/3043.html.
[8] C. Clark, K. Fraser, S. Hand, J. Hanse, E. Jul, C. Limpach,
I. Pratt, and A. Warfiel. Live migration of virtual machines.
In Proceedings of NSDI, May 2005.
[9] Disaster Recovery Journal. Business continuity glossary.
http://www.drj.com/glossary/drjglossary.html.
[10] Laura Grit, David Irwin, Aydan Yumerefendi, and Jeff
Chase. Virtual machine hosting for networked clusters:
Building the foundations for autonomic orchestration. In
the First International Workshop on Virtualization
Technology in Distributed Computing (VTDC), November
2006.
[11] M. Ji, A. Veitch, and J. Wilkes. Seneca: Remote mirroring
done write. USENIX 2003 Annual Technical Conference,
June 2003.
[12] M. Kozuch and M. Satyanarayanan. Internet suspend and
resume. In Proceedings of the Fourth IEEE Workshop on
Mobile Computing Systems and Applications, Calicoon, NY,
June 2002.
[13] Xiaotao Liu, Gal Niv, K. K. Ramakrishnan, Prashant Shenoy,
and Jacobus Van der Merwe. The case for semantic aware
remote replication. In Proc. 2nd International Workshop on
Storage Security and Survivability (StorageSS 2006),
Alexandria, VA, October 2006.
[14] Michael Nelson, Beng-Hong Lim, and Greg Hutchins. Fast
Transparent Migration for Virtual Machines. In USENIX
Annual Technical Conference, 2005.
[15] Mendel Rosenblum and Tal Garfinkel. Virtual machine
monitors: Current technology and future trends. Computer,
38(5):39-47, 2005.
[16] C. Ruemmler and J. Wilkes. Unix disk access patterns.
Proceedings of Winter 1993 USENIX, Jan 1993.
[17] Paul Ruth, Junghwan Rhee, Dongyan Xu, Rick Kennell, and
Sebastien Goasguen. Autonomic Live Adaptation of Virtual
Computational Environments in a Multi-Domain
Infrastructure. In IEEE International Conference on
Autonomic Computing (ICAC), June 2006.
[18] Constantine P. Sapuntzakis, Ramesh Chandra, Ben Pfaff, Jim
Chow, Monica S. Lam, and Mendel Rosenblum. Optimizing
the migration of virtual computers. In Proceedings of the 5th
Symposium on Operating Systems Design and
Implementation, December 2002.
[19] A. Sundararaj, A. Gupta, and P. Dinda. Increasing
Application Performance in Virtual Environments through
Run-time Inference and Adaptation. In Fourteenth
International Symposium on High Performance Distributed
Computing (HPDC), July 2005.
[20] Symantec Corporation. Veritas Volume Replicator
Administrator's Guide.
http://ftp.support.veritas.com/pub/support/products/Volume_Replicator/2%83842.pdf,
5.0 edition, 2006.
[21] F. Travostino, P. Daspit, L. Gommans, C. Jog, C. de Laat,
J. Mambretti, I. Monga, B. van Oudenaarde, S. Raghunath,
and P. Wang. Seamless live migration of virtual machines
over the man/wan. Elsevier Future Generations Computer
Systems, 2006.
[22] T. Wood, P. Shenoy, A. Venkataramani, and M. Yousif.
Black-box and gray-box strategies for virtual machine
migration. In Proceedings of the Usenix Symposium on
Networked System Design and Implementation (NSDI),
Cambridge, MA, April 2007.
[23] A xen way to iscsi virtualization?
2. Predefined keyword set
The file C-20.key in the SemEval2010 corpus contains the keywords assigned
by the authors:
internetbased, service, data, center, migration, wan, lan, virtual, server,
storage, replication, synchronous, replication, asynchronous, replication, network,
support, storage, voiceoverip, voip, database
3. Candidate keyword set
The candidate keyword set is extracted from noun phrases, named entities,
and three-word phrases that occur repeatedly:
migration, replication, virtualization, server, live, outages, center, storage,
network, virtual, recovery, technologies, data, ramakrishnan, internetbased,
application, remote, cooperative, wan, availability, maintenance, services,
applications, service, technology, aware, distributed, operation, contextaware,
management, approach, continued, centers, context, unplanned, access, providers,
requirements, networks, significant, nondisruptive, business, systems, facilities,
shenoy, prashant, intelligent, face, design, advances, objectives, disaster, manner,
concern, categories, unanticipated, functions, wide, lan, high, minor, wans,
propose, area, computercommunication, dynamic, internet, environment,
mechanisms, physical, der, utility, disk, local, networking, functionality, failures,
van, asynchronous, tight, ongoing, point, synchronous, jacobus, support, paper,
disruptions, catastrophic, effective, current, servers, particular, critical, seamless,
connectivity, administrator, tunnel, merwe, subsystems, environments, use,
entertainment, operations, general, new, balance, labsresearch, outage,
applicationservice, first, operating, second, available, address, users, essential,
technical, location, allow, servicesapplications, realtime, building, events,
continuous, instantaneous, introduction, milliseconds, sophisticated, reliability,
semantics, descriptors, different, number, university, mirroring, machine,
components, interactions, frequency, configuration, state, similar, private, system,
techniques, abstract, downtime, amount, prior, concerns, underlying, connections,
framework, work, changes, same, mission, ip, recent, robust, massachusetts,
reasons, tool, contribution, coordinate, fashion, viewpoint, reachability, multiple,
level, terms, redundancy, subject, example, failure, presents, challenges,
performance, blocks, requirement, disruption, provider, section, http, software,
platforms, extensions, write, networkbased, hotswappable, active, traffic, case,
redundant, kk, such, today, throughput, robustness, running, shared, practices,
knowledge, little, essence, appropriate, capabilities, businessusability, large,
individual, needs, tens, consistent, es, addresses, unsolicited, cognizant,
implication, necessity, complete, supplies, ability, load, router, nature, process,
towards, databases, eg, primary, coordination, becomes, mobility, replicas, unique,
experience, checkpoint, optimal, devices, efficiency, desirable, entirety, varies,
attractive, difficulty, unresolved, anticipation, entire, san, switchover,
observations, convergence, feat, hundreds, logic, constraints, applicationaware,
way, main, simultaneous, means, unavailability, localprimary, reason, coalescing,
challenge, other, considerations, actions, part, driver, initial, consistency, binding,
latency, overhead, extension, whole, desire, weight, switch, latter, preparation,
signal, snapshot, anticipated, actual, subset, divergence, space, operators, transfer,
localized, enabler, alternative, processor, sessionbased, heavy, protocols,
layertwo, discussion, power, phase, vpns, decades, focus, purposes, kind, routers,
completion, loss, basic, preestablished, routed, event, site, signals, properties,
consequence, key, recoveryoriented, financial, edge, necessary, communication,
impact, perspective, intermediate, virtualized, subsystem, order, convenient, time,
previous, switches, routing, orchestrated, reply, approaches, checkpoints, cases,
suspend, bulk, information, encapsulate, preferred, modern, efficient, solution,
mac, priority, concert, nonlan, manual, years, clusterbased, sans, returns, flexible,
need, scales, semanticaware, problem, secondary, occurrence, path, initiation,
subsequent, builds, writes, direct, status, metrics, starts, forms, change, situation,
exposition, addition, selections, scenarios, resume, gateway, issues, disks,
voiceoverip, persistent, affordable, certain, delivery, exposure, copies, successful,
block, detail, simplicity, ie, orchestration, snapshots, situations, description,
incidental, final, emphasizes, sessions, normal, element, proceeds, checkpointing,
effects, customer, periodic, route, commercial, stringent, interfaces, extreme,
spectrum, instance, over, blades, principle, processing, foundations, avoidance,
arp, scale, manageable, mechanism, datatype, alternate, standbyin, continuity,
excellent, snaprestore, technique, versatile, mode, step, downtimes, price, ingress,
minimum, fact, prompting, replica, impending, activity, session, taxonomy,
advanced, memory, interval, products, numerous, parts, safe, control, view, clients,
standby, function, detailed, figure, steps, regards, strategies, dormant, specific,
deeper, computers, vs, proxy, monitors, fault, clusters, basis, right, machines,
course, details, feature, points, schemes, quick, cold, voip, differs, timing, semantic,
incremental, several, tunnels, graybox, patterns, full, hitless, ht, glossary, spirit,
causal, international, notion, source, usenix, set, models, others, edition, thesis,
topic, ar, single, note, hot, survivability, short, proceedings, trends, autonomic,
implementation, pe, grid, warm, core, infrastructure, venkataramani, model, copy,
computing, many, 15th, symposium, ml, ce, pages, manwan, computational,
routerfarm, conference, y3043, multidomain, transaction, www, os, uppor,
transparent, generations, networked, conclusion, ne, december, gl, replicator, art,
pdf, adaptation, principles, logging, roc, pes, replistor, 5th, fourteenth, abdelmalek,
garfinkel, workshop, dr, related, sebastien, ydr, references, computer, pl, pr,
paradigm, crashes, sundararaj, inference, agrawal, appliance, travostino,
oudenaarde, cornell, sigcomm, thereska, message, blackbox, prasad, seshan, rpos,
ve, security, dragovic, vm, raghunath, society, file, calicoon, mesnier, benghong,
june, goasguen, nelson, journal, runtime, winter, annual, cranor, sip, pastor,
dongyan, barham, fourth, ieee, report, wilkes, tr3043, future, de, xp, sosp, seneca,
volume, unix, wachs, gupta, icac, guide, yousif, wood, gal, mms, phd, dinda, proc,
lim, vtdc, wang, liu, jan, inm, rhee, andrew, hpdc, ii, va, sinnamohideen, niv,
monica, xu, tal, ny, satyanarayanan, hp, kobus, re, keir, alexandria, srinivasan,
rosenblum, hendricks, storagess, laat, marzullo, courtright, shirako, klosterman,
sambasivan, panagiotis, vmware, yumerefendi, corporation, snapmirror, xen,
greenberg, bailey, hutchins, constantine, sapuntzakis, patterson, albert, app,
harris, jennifer, nsdi, steven, neugebar, warfield, ursa, watson, ibm, paul, susan,
chandra, kozuch, brown, vms, ganger, salmon, strunk, katcher, walters, irwin,
clark, mukesh, alvisi, limpach, warfiel, emc, fraser, alex, michael, ruemmler,
mambretti, chase, junghwan, ramesh, wylie, xiaotao, jorge, sebos, yates, symantec,
pratt, cambridge, boris, jeff, tim, kennell, ian, david, oduc, veitch, hand, ho, rick,
mendel, hanse, veritas, gommans, laura, aydan, rolf, daspit, grit, acm, lam, ben,
jim, pfaff, jul, greg, ji, monga, ruth, chow, vol, jog, ume, ma, rpo
4. Word importance results
The predicted importance of each word in the candidate keyword set, sorted in
descending order of importance (words in bold among the first 10 results belong
to the predefined keyword set).
Table P1. Predicted word importance
No. Candidate keyword y value No. Candidate keyword y value
1 migration 0.840984 382 resume 0.0045808
2 replication 0.678164 383 gateway 0.0045794
3 virtualization 0.659803 384 issues 0.004577
4 server 0.618527 385 disks 0.0045655
5 live 0.61789 386 voiceoverip 0.0045531
6 outages 0.610963 387 persistent 0.0045473
7 center 0.606313 388 affordable 0.004527
8 storage 0.542996 389 certain 0.0045241
9 network 0.533428 390 delivery 0.0045182
10 virtual 0.490329 391 exposure 0.0045027
11 recovery 0.462664 392 copies 0.0044478
12 technologies 0.439565 393 successful 0.0044444
13 data 0.439148 394 block 0.0044061
14 ramakrishnan 0.371115 395 detail 0.0043981
15 internetbased 0.363683 396 simplicity 0.0043912
16 application 0.32836 397 ie 0.0043785
17 remote 0.311041 398 orchestration 0.0043709
18 cooperative 0.298606 399 snapshots 0.00437
19 wan 0.260697 400 situations 0.0043109
20 availability 0.249949 401 description 0.0042526
21 maintenance 0.246622 402 incidental 0.0042457
22 services 0.236617 403 final 0.0042226
23 applications 0.21758 404 emphasizes 0.0042183
24 service 0.208958 405 sessions 0.0041841
25 technology 0.188583 406 normal 0.0041711
26 aware 0.184637 407 element 0.0041647
27 distributed 0.184114 408 proceeds 0.0041553
28 operation 0.178693 409 checkpointing 0.0041511
29 contextaware 0.174755 410 effects 0.0041382
30 management 0.16334 411 customer 0.00413
31 approach 0.160562 412 periodic 0.0040518
32 continued 0.159632 413 route 0.0040377
33 centers 0.151135 414 commercial 0.0040179
34 context 0.149825 415 stringent 0.0039577
35 unplanned 0.136594 416 interfaces 0.0039557
36 access 0.135795 417 extreme 0.0039349
37 providers 0.132818 418 spectrum 0.0039304
38 requirements 0.132789 419 instance 0.0039263
39 networks 0.131715 420 over 0.003924
40 significant 0.129641 421 blades 0.0039219
41 nondisruptive 0.127865 422 principle 0.0038941
42 business 0.126391 423 processing 0.0038913
43 systems 0.123377 424 foundations 0.0038743
44 facilities 0.122692 425 avoidance 0.0038511
45 shenoy 0.113144 426 arp 0.0038416
46 prashant 0.109356 427 scale 0.0038255
47 intelligent 0.107079 428 manageable 0.0038238
48 face 0.10703 429 mechanism 0.0038141
49 design 0.10576 430 datatype 0.0037792
50 advances 0.105361 431 alternate 0.0037481
51 objectives 0.103294 432 standbyin 0.0037446
52 disaster 0.102727 433 continuity 0.0037397
53 manner 0.102325 434 excellent 0.0037301
54 concern 0.101271 435 snaprestore 0.0036539
55 categories 0.098923 436 technique 0.0036314
56 unanticipated 0.097049 437 versatile 0.0036119
57 functions 0.097024 438 mode 0.0036081
58 wide 0.094731 439 step 0.0036024
59 lan 0.09354 440 downtimes 0.0035835
60 high 0.091383 441 price 0.0035753
61 minor 0.087343 442 ingress 0.0035746
62 wans 0.081532 443 minimum 0.0035724
63 propose 0.081235 444 fact 0.0035698
64 area 0.075401 445 prompting 0.0035426
65 computercommunication 0.071124 446 replica 0.0035229
66 dynamic 0.070823 447 impending 0.0035184
67 internet 0.070188 448 activity 0.003507
68 environment 0.07006 449 session 0.0034768
69 mechanisms 0.068699 450 taxonomy 0.0034586
70 physical 0.067211 451 advanced 0.0033992
71 der 0.065018 452 memory 0.0033911
72 utility 0.064928 453 interval 0.0033575
73 disk 0.063486 454 products 0.0033335
74 local 0.062574 455 numerous 0.0032772
75 networking 0.062066 456 parts 0.0032592
76 functionality 0.060316 457 safe 0.0032163
77 failures 0.058608 458 control 0.0031983
78 van 0.056861 459 view 0.0031885
79 asynchronous 0.055627 460 clients 0.0031725
80 tight 0.055299 461 standby 0.0031548
81 ongoing 0.055233 462 function 0.0031449
82 point 0.055094 463 detailed 0.0031171
83 synchronous 0.055003 464 figure 0.0030952
84 jacobus 0.054747 465 steps 0.0030943
85 support 0.05382 466 regards 0.0030621
86 paper 0.053773 467 strategies 0.0030361
87 disruptions 0.053526 468 dormant 0.0029639
88 catastrophic 0.052759 469 specific 0.0029358
89 effective 0.052542 470 deeper 0.0029295
90 current 0.051912 471 computers 0.0029269
91 servers 0.049387 472 vs 0.0029162
92 particular 0.049132 473 proxy 0.0029011
93 critical 0.047906 474 monitors 0.0028663
94 seamless 0.046336 475 fault 0.00285
95 connectivity 0.044587 476 clusters 0.002828
96 administrator 0.043383 477 basis 0.002826
97 tunnel 0.04324 478 right 0.0028248
98 merwe 0.041538 479 machines 0.0028083
99 subsystems 0.041409 480 course 0.0028024
100 environments 0.040257 481 details 0.0027973
101 use 0.036387 482 feature 0.0027855
102 entertainment 0.036088 483 points 0.0027779
103 operations 0.03538 484 schemes 0.0027776
104 general 0.034748 485 quick 0.0027557
105 new 0.034575 486 cold 0.0027474
106 balance 0.034293 487 voip 0.0027359
107 labsresearch 0.034123 488 differs 0.0027316
108 outage 0.032523 489 timing 0.0026993
109 applicationservice 0.0323 490 semantic 0.0026863
110 first 0.032053 491 incremental 0.0026836
111 operating 0.031873 492 several 0.002665
112 second 0.031871 493 tunnels 0.0026577
113 available 0.031776 494 graybox 0.0026532
114 address 0.031016 495 patterns 0.0025977
115 users 0.030828 496 full 0.0025534
116 essential 0.030411 497 hitless 0.0025488
117 technical 0.030131 498 ht 0.0025404
118 location 0.029677 499 glossary 0.0025208
119 allow 0.029313 500 spirit 0.002509
120 servicesapplications 0.029243 501 causal 0.0024637
121 realtime 0.029228 502 international 0.002438
122 building 0.029168 503 notion 0.0024247
123 events 0.028764 504 source 0.0024038
124 continuous 0.028121 505 usenix 0.0023922
125 instantaneous 0.027937 506 set 0.0023618
126 introduction 0.027788 507 models 0.0023371
127 milliseconds 0.027768 508 others 0.0023345
128 sophisticated 0.027693 509 edition 0.0022988
129 reliability 0.027484 510 thesis 0.0022935
130 semantics 0.02676 511 topic 0.0022763
131 descriptors 0.025592 512 ar 0.0022534
132 different 0.025459 513 single 0.0022509
133 number 0.025158 514 note 0.0022465
134 university 0.024702 515 hot 0.0022409
135 mirroring 0.024501 516 survivability 0.0022022
136 machine 0.024371 517 short 0.0021964
137 components 0.024161 518 proceedings 0.0021695
138 interactions 0.024085 519 trends 0.0021362
139 frequency 0.023768 520 autonomic 0.0021327
140 configuration 0.023489 521 implementation 0.002083
141 state 0.023362 522 pe 0.0020795
142 similar 0.023309 523 grid 0.0020748
143 private 0.023003 524 warm 0.002055
144 system 0.022504 525 core 0.0020492
145 techniques 0.021737 526 infrastructure 0.0020421
146 abstract 0.021286 527 venkataramani 0.0020299
147 downtime 0.021142 528 model 0.0020187
148 amount 0.020983 529 copy 0.00194
149 prior 0.020438 530 computing 0.0018993
150 concerns 0.020368 531 many 0.0018784
151 underlying 0.020181 532 15th 0.0018723
152 connections 0.020169 533 symposium 0.0018621
153 framework 0.019967 534 ml 0.0018518
154 work 0.019943 535 ce 0.0018451
155 changes 0.019746 536 pages 0.0017967
156 same 0.019699 537 manwan 0.0017894
157 mission 0.019429 538 computational 0.0017754
158 ip 0.019258 539 routerfarm 0.0017595
159 recent 0.018481 540 conference 0.0017545
160 robust 0.018437 541 y3043 0.00175
161 massachusetts 0.018398 542 multidomain 0.0017365
162 reasons 0.018173 543 transaction 0.0017094
163 tool 0.018069 544 www 0.0016864
164 contribution 0.017874 545 os 0.0016773
165 coordinate 0.017508 546 uppor 0.0016162
166 fashion 0.01731 547 transparent 0.0016075
167 viewpoint 0.017007 548 generations 0.0015975
168 reachability 0.016715 549 networked 0.0015931
169 multiple 0.016661 550 conclusion 0.0015557
170 level 0.016497 551 ne 0.0015469
171 terms 0.016478 552 december 0.0015402
172 redundancy 0.016408 553 gl 0.0015227
173 subject 0.015692 554 replicator 0.0015208
174 example 0.015675 555 art 0.0015121
175 failure 0.015571 556 pdf 0.0015027
176 presents 0.015501 557 adaptation 0.001501
177 challenges 0.015161 558 principles 0.0014718
178 performance 0.014878 559 logging 0.0014657
179 blocks 0.014839 560 roc 0.0014649
180 requirement 0.014562 561 pes 0.0014648
181 disruption 0.014552 562 replistor 0.0014273
182 provider 0.014527 563 5th 0.0014269
183 section 0.014516 564 fourteenth 0.0014263
184 http 0.014488 565 abdelmalek 0.0014201
185 software 0.01448 566 garfinkel 0.0014167
186 platforms 0.014328 567 workshop 0.0014111
187 extensions 0.014232 568 dr 0.0013682
188 write 0.014197 569 related 0.0013521
189 networkbased 0.014171 570 sebastien 0.0013487
190 hotswappable 0.013964 571 ydr 0.0013394
191 active 0.01382 572 references 0.0013047
192 traffic 0.013746 573 computer 0.0012926
193 case 0.013394 574 pl 0.0012628
194 redundant 0.013225 575 pr 0.0012412
195 kk 0.012854 576 paradigm 0.0012372
196 such 0.01283 577 crashes 0.0012056
197 today 0.01281 578 sundararaj 0.001189
198 throughput 0.012766 579 inference 0.001188
199 robustness 0.012753 580 agrawal 0.0011803
200 running 0.012621 581 appliance 0.001175
201 shared 0.012589 582 travostino 0.0011587
202 practices 0.012569 583 oudenaarde 0.0011525
203 knowledge 0.012438 584 cornell 0.0011472
204 little 0.012241 585 sigcomm 0.00111
205 essence 0.012232 586 thereska 0.0011092
206 appropriate 0.012105 587 message 0.0011037
207 capabilities 0.012089 588 blackbox 0.0011029
208 businessusability 0.011849 589 prasad 0.0010992
209 large 0.011811 590 seshan 0.0010836
210 individual 0.011753 591 rpos 0.0010718
211 needs 0.011703 592 ve 0.0010707
212 tens 0.01148 593 security 0.0010704
213 consistent 0.011461 594 dragovic 0.0010703
214 es 0.011417 595 vm 0.001057
215 addresses 0.01139 596 raghunath 0.0010229
216 unsolicited 0.011261 597 society 0.001008
217 cognizant 0.011184 598 file 0.0010009
218 implication 0.011171 599 calicoon 0.0009904
219 necessity 0.011126 600 mesnier 0.0009895
220 complete 0.011042 601 benghong 0.000976
221 supplies 0.010953 602 june 0.0009648
222 ability 0.010771 603 goasguen 0.0009563
223 load 0.010654 604 nelson 0.0009411
224 router 0.010647 605 journal 0.0009258
225 nature 0.010615 606 runtime 0.000911
226 process 0.010454 607 winter 0.0009097
227 towards 0.009911 608 annual 0.0008994
228 databases 0.009863 609 cranor 0.0008817
229 eg 0.009755 610 sip 0.0008663
230 primary 0.009714 611 pastor 0.0008657
231 coordination 0.009567 612 dongyan 0.000851
232 becomes 0.009501 613 barham 0.0008447
233 mobility 0.009493 614 fourth 0.0008392
234 replicas 0.009441 615 ieee 0.0008385
235 unique 0.009179 616 report 0.000827
236 experience 0.009144 617 wilkes 0.0008237
237 checkpoint 0.009134 618 tr3043 0.0008223
238 optimal 0.009079 619 future 0.0008174
239 devices 0.009068 620 de 0.0008126
240 efficiency 0.009034 621 xp 0.0007982
241 desirable 0.009008 622 sosp 0.0007951
242 entirety 0.008996 623 seneca 0.0007899
243 varies 0.008913 624 volume 0.0007782
244 attractive 0.008811 625 unix 0.0007777
245 difficulty 0.008741 626 wachs 0.0007764
246 unresolved 0.008637 627 gupta 0.0007693
247 anticipation 0.008615 628 icac 0.0007677
248 entire 0.008518 629 guide 0.0007181
249 san 0.008446 630 yousif 0.0007099
250 switchover 0.008434 631 wood 0.0006844
251 observations 0.008378 632 gal 0.0006833
252 convergence 0.008318 633 mms 0.0006742
253 feat 0.007965 634 phd 0.0006715
254 hundreds 0.007892 635 dinda 0.0006558
255 logic 0.00782 636 proc 0.0006469
256 constraints 0.007783 637 lim 0.0006428
257 applicationaware 0.007772 638 vtdc 0.0006254
258 way 0.00776 639 wang 0.0006206
259 main 0.007727 640 liu 0.000605
260 simultaneous 0.007708 641 jan 0.0006041
261 means 0.007664 642 inm 0.000602
262 unavailability 0.007623 643 rhee 0.0005962
263 localprimary 0.007621 644 andrew 0.0005909
264 reason 0.007501 645 hpdc 0.0005783
265 coalescing 0.00743 646 ii 0.0005641
266 challenge 0.007417 647 va 0.0005618
267 other 0.007394 648 sinnamohideen 0.0005494
268 considerations 0.00738 649 niv 0.0005456
269 actions 0.007338 650 monica 0.0005414
270 part 0.007235 651 xu 0.0005404
271 driver 0.007214 652 tal 0.0005352
272 initial 0.007191 653 ny 0.0005255
273 consistency 0.007187 654 satyanarayanan 0.0004802
274 binding 0.007176 655 hp 0.0004593
275 latency 0.007136 656 kobus 0.0004488
276 overhead 0.007092 657 re 0.0004384
277 extension 0.007062 658 keir 0.0003816
278 whole 0.007004 659 alexandria 0.0003689
279 desire 0.006951 660 srinivasan 0.0003569
280 weight 0.006927 661 rosenblum 0.0003449
281 switch 0.006921 662 hendricks 0.0003437
282 latter 0.006908 663 storagess 0.0003361
283 preparation 0.006881 664 laat 0.0002986
284 signal 0.006875 665 marzullo 0.0002801
285 snapshot 0.006874 666 courtright 0.0002798
286 anticipated 0.006833 667 shirako 0.0002795
287 actual 0.006811 668 klosterman 0.0002775
288 subset 0.006804 669 sambasivan 0.0002757
289 divergence 0.006801 670 panagiotis 0.0002679
290 space 0.006789 671 vmware 0.0002562
291 operators 0.006741 672 yumerefendi 0.0002521
292 transfer 0.006687 673 corporation 0.0002517
293 localized 0.006684 674 snapmirror 0.0002388
294 enabler 0.006667 675 xen 0.0002384
295 alternative 0.006654 676 greenberg 0.0002366
296 processor 0.00665 677 bailey 0.0002238
297 sessionbased 0.006645 678 hutchins 0.000217
298 heavy 0.00664 679 constantine 0.0002148
299 protocols 0.006616 680 sapuntzakis 0.0002146
300 layertwo 0.006604 681 patterson 0.000214
301 discussion 0.006598 682 albert 0.0002124
302 power 0.006525 683 app 0.0002115
303 phase 0.006509 684 harris 0.0002076
304 vpns 0.006507 685 jennifer 0.0002048
305 decades 0.006464 686 nsdi 0.0001985
306 focus 0.006448 687 steven 0.0001931
307 purposes 0.006413 688 neugebar 0.0001924
308 kind 0.006337 689 warfield 0.0001916
309 routers 0.006307 690 ursa 0.0001906
310 completion 0.006291 691 watson 0.0001886
311 loss 0.006257 692 ibm 0.0001831
312 basic 0.00625 693 paul 0.0001793
313 preestablished 0.006211 694 susan 0.0001779
314 routed 0.006206 695 chandra 0.0001771
315 event 0.006204 696 kozuch 0.0001758
316 site 0.006195 697 brown 0.0001757
317 signals 0.006155 698 vms 0.0001715
318 properties 0.006106 699 ganger 0.000165
319 consequence 0.006096 700 salmon 0.0001633
320 key 0.006056 701 strunk 0.0001625
321 recoveryoriented 0.006006 702 katcher 0.0001613
322 financial 0.005989 703 walters 0.000161
323 edge 0.00598 704 irwin 0.0001609
324 necessary 0.00597 705 clark 0.0001604
325 communication 0.005953 706 mukesh 0.0001593
326 impact 0.005874 707 alvisi 0.0001563
327 perspective 0.005872 708 limpach 0.0001553
328 intermediate 0.005856 709 warfiel 0.0001546
329 virtualized 0.005826 710 emc 0.0001529
330 subsystem 0.00579 711 fraser 0.0001527
331 order 0.005773 712 alex 0.0001513
332 convenient 0.005768 713 michael 0.0001502
333 time 0.005661 714 ruemmler 0.0001499
334 previous 0.005661 715 mambretti 0.0001485
335 switches 0.005652 716 chase 0.0001482
336 routing 0.005564 717 junghwan 0.0001478
337 orchestrated 0.00556 718 ramesh 0.0001438
338 reply 0.00556 719 wylie 0.0001412
339 approaches 0.005505 720 xiaotao 0.0001395
340 checkpoints 0.005482 721 jorge 0.0001386
341 cases 0.005471 722 sebos 0.0001385
342 suspend 0.005456 723 yates 0.0001374
343 bulk 0.005414 724 symantec 0.0001365
344 information 0.005397 725 pratt 0.0001327
345 encapsulate 0.005384 726 cambridge 0.0001317
346 preferred 0.005355 727 boris 0.0001299
347 modern 0.005348 728 jeff 0.0001295
348 efficient 0.005325 729 tim 0.0001289
349 solution 0.005286 730 kennell 0.0001287
350 mac 0.005283 731 ian 0.0001283
351 priority 0.005279 732 david 0.0001278
352 concert 0.005264 733 oduc 0.0001276
353 nonlan 0.005264 734 veitch 0.0001266
354 manual 0.005255 735 hand 0.000125
355 years 0.005195 736 ho 0.0001217
356 clusterbased 0.005181 737 rick 0.0001214
357 sans 0.005139 738 mendel 0.0001195
358 returns 0.005111 739 hanse 0.0001193
359 flexible 0.005098 740 veritas 0.0001192
360 need 0.005092 741 gommans 0.0001144
361 scales 0.005086 742 laura 0.0001142
362 semanticaware 0.005044 743 aydan 0.000114
363 problem 0.004982 744 rolf 0.0001129
364 secondary 0.004961 745 daspit 0.0001003
365 occurrence 0.004953 746 grit 0.0001001
366 path 0.004924 747 acm 9.91E-05
367 initiation 0.004921 748 lam 9.84E-05
368 subsequent 0.004894 749 ben 9.84E-05
369 builds 0.004858 750 jim 9.81E-05
370 writes 0.004834 751 pfaff 9.61E-05
371 direct 0.004823 752 jul 9.06E-05
372 status 0.004808 753 greg 9.00E-05
373 metrics 0.004762 754 ji 8.76E-05
374 starts 0.004734 755 monga 8.70E-05
375 forms 0.004701 756 ruth 8.66E-05
376 change 0.004692 757 chow 8.39E-05
377 situation 0.004689 758 vol 6.78E-05
378 exposition 0.004662 759 jog 6.70E-05
379 addition 0.00465 760 ume 5.71E-05
380 selections 0.004642 761 ma 5.24E-05
381 scenarios 0.004629 762 rpo 1.36E-05
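The ranking step behind the table above can be sketched as follows. This is a minimal illustration, not the thesis implementation: it assumes the model has already produced a y value for each candidate keyword, and the function names `rank_candidates` and `hits_in_top_k` are hypothetical. The six scores are copied from the table; the `predefined_keywords` set is a placeholder, since the words originally marked in bold are not recoverable from this text.

```python
from typing import Dict, List, Set, Tuple


def rank_candidates(scores: Dict[str, float]) -> List[Tuple[int, str, float]]:
    """Sort candidate keywords by predicted importance y, highest first,
    and attach a 1-based rank to each entry."""
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(rank, word, y) for rank, (word, y) in enumerate(ordered, start=1)]


def hits_in_top_k(
    ranked: List[Tuple[int, str, float]], predefined: Set[str], k: int = 10
) -> List[str]:
    """Return the words among the first k ranked results that also appear
    in the predefined keyword set (the words shown in bold in the table)."""
    return [word for _, word, _ in ranked[:k] if word in predefined]


# A few (word, y) pairs taken from the table, deliberately out of order.
sample_scores = {
    "rpo": 1.36e-05,
    "server": 0.618527,
    "migration": 0.840984,
    "resume": 0.0045808,
    "replication": 0.678164,
    "virtualization": 0.659803,
}

# Placeholder only: NOT the actual predefined keyword set of the thesis.
predefined_keywords = {"migration", "resume"}

ranked = rank_candidates(sample_scores)
for rank, word, y in ranked:
    print(rank, word, y)

top_hits = hits_in_top_k(ranked, predefined_keywords, k=3)
```

With the real data, `scores` would hold all 762 candidate keywords and `k` would be 10, matching the description given with the table.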