ICFA SCIC Dec '01 Meeting, CERN, Geneva

Author: Les Cottrell. Created: December 8


Attendees: 

Harvey Newman (Caltech), Dean Karlen (Carleton), Richard Hughes-Jones (Manchester), Denis Linglin (IN2P3), Manuel Delfino (CERN), Slava Ilyin (MSU), Matthias Kasemann (FNAL), Yukio Karita (KEK), David Williams (CERN), Les Cottrell (SLAC), Michael Ernst (DESY)


Introduction

The ICFA chairman wants the SCIC to continue; the current SCIC chair has asked to be relieved of the chairmanship. The ICFA chair asked for input & suggestions and agreed that Harvey Newman should be the new chair. The official changeover will be at the ICFA meeting at SLAC on February 14-15. We also want to give a short interim status report at the SLAC meeting. An important part of the mandate is also to look at the more poorly connected HEP collaborator countries.

Organize agenda:

Status reports will go first.

US & transatlantic networking, Grids - Harvey

Russian international connectivity - Slava

Canada - Dean

UK to US & Europe connectivity, plus other networking & Grid projects - RHJ

Switch to GEANT, wavelengths & interconnects, and comments from HEPCCC - Manuel

Comments on Europe to poorer connected countries - David

Performance between many countries & high performance network measurements – Les

Status of German networks - Michael

Update on US connectivity from CHEP – Harvey (see US status)

Large amounts of data (already many hundreds of TBytes; BaBar > 400 TB) and data Grids are emerging. By 2005 major labs will need 2.5-10 Gbits/s. The model for HEP computing is based on hierarchical tiers (Tier 0 for the lab, Tier 1 for major regional/national computer centers, then institutions, then departments, then users). Upgrades are quickly used up: typically within a month of installation an upgrade sees heavy use.
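As a back-of-the-envelope illustration (a sketch, not from the talk; the rates and the 400 TB figure come from the text above, the arithmetic is mine), the link between these data volumes and the 2005 bandwidth targets can be made concrete:

    # How long would replicating a BaBar-sized dataset (>400 TB) from a
    # Tier 0 to a Tier 1 center take at various line rates?
    dataset_bits = 400e12 * 8                   # 400 TB expressed in bits

    for gbps in (0.155, 0.622, 2.5, 10):
        seconds = dataset_bits / (gbps * 1e9)
        print(f"{gbps:>6} Gbps: {seconds / 86400:6.1f} days")

    # ~239 days at 155 Mbps, ~15 days at 2.5 Gbps, under 4 days at
    # 10 Gbps, which is why the 2005 target is 2.5-10 Gbits/s.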

The HEP plan for bandwidth FY2001 through FY2006 can be found at http://gate.hep.anl.gov/lprice/TAN. CERN goes from 155 Mbps to 2*155 Mbps this year, and to 622 Mbps in April 2002; DataTAG adds a 2.5 Gbps research link in summer 2002 and a 10 Gbps research link in 2003 or 2004. The big surprise in drawing up the TAN review was that the stated requirements were bigger than expected; in particular the reality of BaBar distributing data to IN2P3 was much greater than earlier estimates. A big effect was the cost of people, the need to make them truly effective collaborators, and the bandwidth needed to use these people effectively, at a time when the cost of bandwidth is dropping while the cost of people is increasing.

In the Hoffmann report, network requirements are dominated by the high-performance sites, but the estimates are well ordered and very conservative (there is a lot of potential for growth); even remote interactive sessions have large requirements, and those are restricted (e.g. only 30 concurrent interactive sessions).

To indicate that these numbers for HEP are in line with the rest of the Internet: US Internet traffic has historically grown by 2.8 times per year and is now growing by a factor of 4 per year. By 2010 this corresponds to 10 Pbps. One of the possible drivers (according to Roberts) for the need for this bandwidth is an Internet TV channel per person in the first world by 2006. Another confirmation is that the Amsterdam Internet Exchange bandwidth has grown by a factor of 4 in the last year. Such growth in backbone bandwidth also requires a large investment in local infrastructure.
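For scale (a sketch; the 2001 baseline below is an assumption chosen only to illustrate the compounding, not a measured figure):

    # Compounding the quoted factor-of-4 annual growth from an assumed
    # 2001 baseline of ~40 Gbps average US Internet traffic.
    baseline_2001_bps = 40e9      # assumption, for illustration only
    growth_per_year = 4           # current growth factor quoted above

    for year in (2002, 2006, 2010):
        traffic = baseline_2001_bps * growth_per_year ** (year - 2001)
        print(f"{year}: ~{traffic:.1e} bps")

    # 2010: ~1.0e+16 bps, i.e. ~10 Pbps, matching the figure above.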

In the last year bandwidth on transatlantic links has gone up by factors of 3 to 16. Sylvain Ravot has achieved 120 Mbps between CERN and Caltech. He applied a modification to the Linux kernel to improve the behavior of ssthresh (the TCP slow-start threshold).

There is an Internet2 HENP networking working group. One issue is a new concept of fairness. Very low loss is required: e.g. to get 1 Gbps from Los Angeles to CERN the loss rate must be no more than about 1 in 7*10^7 packets. Problems at high speed include firewalls, router and other interface costs, TCP behavior, losses, error rates etc.
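The 1-in-7*10^7 figure follows from the well-known Mathis et al. single-stream TCP model, throughput <= MSS/(RTT*sqrt(loss)), the same relation the Carleton group quotes below. A minimal sketch, assuming a 1460-byte MSS and a ~100 ms round-trip time:

    def max_loss_rate(target_bps, mss_bytes=1460, rtt_s=0.100):
        """Highest packet-loss probability compatible with target_bps
        for a single TCP stream, by inverting the Mathis formula."""
        mss_bits = mss_bytes * 8
        return (mss_bits / (rtt_s * target_bps)) ** 2

    p = max_loss_rate(1e9)   # 1 Gbps target
    print(f"loss rate <= {p:.1e}, i.e. about 1 packet in {1/p:,.0f}")

    # -> about 1 packet in 7e7, consistent with the figure above.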

Abilene has a 2.5 Gbps backbone and reaches 50 states; the partnership with Qwest has been extended through 2006, and the backbone will be extended to 10 Gbps by 2003. Internet2 has proposed 100 wavelengths for research. The engineering staff are very helpful, open and willing to work with advanced users. One fiber can hold 160 wavelengths at 10 Gbps each.

ESNet has had a plan in place, but it is slower than other developments (e.g. Internet2) and in particular does not meet HENP growth needs. ESNet will need to evolve quickly to keep up with Internet2 and with HENP and other user needs.

Wireless is evolving quickly to higher bandwidth. The Isle of Man is now deploying 3G phones (as of yesterday), and 3G phones will in turn drive IPv6 adoption to meet addressing needs.

We will need working groups within experiments to address networking needs, and we need to start applying the Grid prototypes to actual experiments.

Russian international connectivity - Slava Ilyin

Science & education in Russia are mainly centered in Moscow, St Petersburg and Novosibirsk. The Soros foundation has had a program to develop education in many regions. In spring '01 a state program (Electronic Russia) was started to support telecommunications (not yet funded for the coming year); it should improve telecommunications in universities. There is a Russian backbone network centered on Moscow (within Moscow 100 Mbps, with a start made on 1 Gbps) with 100-150 Mbps to St Petersburg. Novosibirsk has 30 Mbps to Moscow (from RBCOM). In the Moscow region there are MSU, Dubna and Protvino/Serpukhov. Moscow to Dubna is 30 Mbps; next year the plan is to upgrade to 155 Mbps and potentially in following years to 1 Gbps (there are optical fibers between Moscow and Dubna). In summer 2001 there was a 6 Mbps microwave link between Moscow and Protvino, with no prospect of going to 1 Gbps. There are plans for optical fiber over the 120 km distance, but at the moment there are no fibers for 40 km of the route from Moscow to Protvino. There is also a scientific center at Troitsk, which has about 8 Mbps and hopes for 30-40 Mbps very soon. In St Petersburg the main HEP institute is in Gatchina, 40 km south of St Petersburg; there is a fiber optic cable, but it is privately owned and too expensive (a monopoly), so at the moment they have only 128 kbps from Gatchina to St Petersburg.

International connectivity is provided by many telecommunications centers. There are two directions: one, under the Ministry of Science & Technology (MoST, including the labs and MSU), runs RBnet; the second, under the Ministry for Education (mainly universities), runs RUNnet with its center in St Petersburg. The Ministry for Education link was 30-32 Mbps and was just upgraded to 155 Mbps to NORDUnet in Helsinki; MoST has the FastNet project. For the last 4 years MoST had a link supported by some state programs, at 16 Mbps by summer 2001, operated by various technical organizations. There was a project, MIRnet, supported by MoST and the NSF, with a 6 Mbps link to STARTAP; it peered with ESNet in June 2000, giving good connectivity to US labs. In summer 2001 MIRnet was transformed into FastNet and its budget merged with the regular MoST budget. There was a tender, and the winning operator was Teleglobe, with an intermediate PoP in Frankfurt and thence to NY in the USA at 45 Mbps (it should be 155 Mbps, but there are technical problems they hope to solve in December/January). This is under the same technical cover as RBnet; politically it is part of the Supercomputer Center in Moscow. The 155 Mbps will be subdivided, with 90 Mbps to the USA commodity Internet; within this will live the MIRnet project. Connectivity to the USA is in particular for Grid projects, e.g. Kurchatov and NCSA. 20-30 Mbps will go to Europe and CERN. The main problem is getting the budget: they hope the 20-30 Mbps will be funded next year, and a possible source is INEAS (a foundation to improve European connectivity to the FSU).

Connectivity from Novosibirsk to KEK is at 128 kbps, and on to Moscow at 128 kbps.

There have so far been no discussions with GEANT on how to connect to Russia.

Canada - Dean Karlen (see Canada status)

CA*net3 has 8 wavelengths available at OC-192 (~10 Gbps). It connects to STARTAP, and also has 155 Mbps to GEANT. An announcement for CA*net4 is awaited next Monday: it will have $100M for 20 years, and will have customer-owned wavelengths. Connectivity within Canada is acceptable; to ESNet good to acceptable; to .edu very varied; to Germany, a big improvement, now good to acceptable; the UK similar; CERN excellent; Italy good. Carleton is also making higher-frequency ping measurements and estimating throughput using MSS/(RTT*sqrt(loss)), and is starting netperf measurements.

Network performance measurements - Les Cottrell (see Monitoring status)

The SLAC-led IEPM/PingER project continues to gather data and to be developed by SLAC and FNAL with contributions from DL and UCL. There are about 32 monitoring sites in 14 countries monitoring hosts in 72 countries that between them contain all the countries with computer sites in the Particle Data Group booklet, over 78% of the world's population and over 99% of the world's Internet-connected population. It is very lightweight in its impact on the network and in its requirements for the remotely monitored sites. It provides information on several metrics including round trip times (RTT), jitter, various measures of loss, and duplicate and out-of-order packets. It is particularly valuable for links with limited performance (e.g. to the developing world). The data goes back almost 6 years. Data and graphical and tabular analyzed summaries, with user selection and drill-down, are available worldwide via the web. The IEPM web site receives over 2000 hits per day. The main community it caters to is HENP, but the over 500 sites monitored include many national labs, commercial sites, and the IPv6 test network.
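For flavor, this is the kind of lightweight probe involved (a sketch only, assuming a Unix-style ping; it is not the actual PingER code, which also derives jitter, duplicate and reordering statistics):

    import re
    import statistics
    import subprocess

    def probe(host, count=10):
        """Ping `host` and return (min, mean, max) RTT in ms plus the
        loss fraction, parsed from the ping replies."""
        out = subprocess.run(["ping", "-c", str(count), host],
                             capture_output=True, text=True).stdout
        rtts = [float(m) for m in re.findall(r"time=([\d.]+)", out)]
        loss = 1.0 - len(rtts) / count
        if not rtts:
            return None, None, None, loss
        return min(rtts), statistics.mean(rtts), max(rtts), loss

    print(probe("www.cern.ch"))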

PingER measurements indicate that performance on most developed-nation links is improving at up to 80% per year, though in some cases (e.g. ESNet to ESNet sites) it is beginning to flatten out. Links to developing areas such as Latin America, Africa, the Middle East, the Indian subcontinent, the Former Soviet Union and Eastern Europe are still poor to bad.

To meet the need to measure and understand high-performance links and Grid applications such as file replication, SLAC is setting up a project to measure network and application (various file copy mechanisms) throughputs. This includes about 30 high-performance remote sites with high-speed links (>= 10 Mbits/s) in 7 countries. Some of the early questions we wish to address include: how to optimize the TCP window sizes and the number of parallel streams; the duration and frequency of measurements; host dependencies (OS, CPU utilization, bus, disk and interface performance); impact on other users; use of QoS; application steering using network information and/or application self-limiting; comparing various file copy tools against one another, against other bandwidth prediction tools, and against simpler mechanisms such as PingER; and whether compression is useful and under what circumstances. So far we have been making measurements of ping, traceroute, iperf, and bbcp since mid October 2001, and are starting to analyze the data.
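The window-size question above has a standard starting point: the TCP window must cover the bandwidth-delay product (BDP), and with n parallel streams each stream needs roughly BDP/n, which is one reason parallel streams help when the per-connection window is capped. A sketch (the 170 ms RTT is an assumed transatlantic value):

    def per_stream_window_bytes(bandwidth_bps, rtt_s, n_streams=1):
        """TCP window per stream needed to fill the path (BDP / n)."""
        bdp_bytes = bandwidth_bps * rtt_s / 8
        return bdp_bytes / n_streams

    # Example: filling a 100 Mbps path with an assumed 170 ms RTT.
    print(f"1 stream : {per_stream_window_bytes(100e6, 0.170)/1024:.0f} KB")
    print(f"8 streams: {per_stream_window_bytes(100e6, 0.170, 8)/1024:.0f} KB each")

    # -> ~2 MB for a single stream, ~260 KB each with 8 streams.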

Early results indicate:

The next steps are to make the measurement-taking more robust; to further understand the impacts of compression; and to add and understand bbftp, GridFTP and other bandwidth measurement tools such as pipechar. Following this we will select a representative minimal subset of tools to make measurements with, improve the reporting/graphing/table tools, and make the data available via the web. We also hope to tie together the measurements being made in the UK with the SLAC measurements so they appear more integrated to the user.

Network connectivity & performance a view from the UK - Richard Hughes-Jones
(see UK status)

The SuperJANET4 backbone & links are supplied by WorldCom. The core is running well; there may be problems in the MANs or site networks. Losses from RAL to CERN were due to a 155 Mbps ATM circuit from DANTE to UKERNA which only supported 133 Mbps. Connectivity to Europe is now via GEANT. There are now 6*155 Mbps (POS) from the UK to the US, about to go to 2.5 Gbps, which will be split between commodity and research traffic; it peers in Hudson Street. Usage is 88% of the total 930 Mbps measured over a 10-minute interval, i.e. close to the limit; the daily average is closer to 50%. Traffic from the US is about 3 times that from the UK.

Grid network projects: DataGrid; GridPP (extending DataGrid to BaBar, D0, CDF, ... in the UK); DataTAG (foci: network research, Grid interoperability); MB-NG (Managed Bandwidth - Next Generation), a UK core-science-funded 2.5 Gbps development network with optical switching. DataGrid WP7 is network monitoring, with several tools: PingER, RIPE one-way times, iperf, UDPmon, rTPL, GridFTP and NWS prediction. Continuous tests have run for the last few months to selected sites: DL, Man, RL, UCL, CERN, Lyon, Bologna, SARA, NBI, SLAC. The aims of monitoring for the Grid are: to inform the applications via the middleware of the current status of the network (input to the resource broker and scheduling); to identify fault conditions in the operation of the Grid; and to understand the instantaneous, day-to-day and month-to-month behavior of the network, providing advice on configuration. A report has been written on using LDAP. UDPmon measures throughput with UDP: it sends a burst of UDP frames spaced at regular intervals, and gives one-way jitter.
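The idea is simple enough to sketch (a toy version, not the real UDPmon, and the receiver address is hypothetical): each frame carries a sequence number and a send timestamp, so a cooperating receiver can compute loss, throughput and one-way jitter from arrival times.

    import socket
    import struct
    import time

    def send_burst(host, port, n_frames=100, size=1400, spacing_s=0.001):
        """Send n_frames UDP frames of `size` bytes at fixed spacing."""
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        pad = b"\x00" * (size - 16)          # 16 bytes of header below
        for seq in range(n_frames):
            frame = struct.pack("!Qd", seq, time.time()) + pad
            sock.sendto(frame, (host, port))
            time.sleep(spacing_s)

    # Hypothetical sink; the receiver compares inter-arrival times with
    # the 1 ms send spacing to obtain the one-way jitter.
    send_burst("udpmon-sink.example.org", 14000)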

An LDAP interface has been added to PingER for users to access the data. Monitoring is also done with Netmon, UDPmon, iperf and RIPE. We are seeing nice correlations between PingER and UDPmon, and between iperf and UDPmon, so UDPmon might be able to give a rough throughput estimate. The NWS predictor tracks the average. iperf from UCL to SARA now gives 90 Mbps with a 100 Mbps NIC on the local host.

How do we monitor at Gbps rates? We need tcpdump on steroids.

DataTAG has been approved: CERN/INFN/INRIA/UvA/PPARC are the partners. Interoperability is needed between Grids in Europe & the US (PPDG, GriPhyN, iVDGL).

MB-NG will use MPLS with managed bandwidth and QoS provision.

GridPP and DataGrid have been working on packaging Globus as an .rpm kit. This makes installation simpler, and it is available from the DataGrid WP6 web pages.

SuperJANET4 is stable and working well. Access at the campus level is becoming important.

Europe and the poorer connected countries - David Williams

The EU commission is interested in trying to improve global connectivity to less well-connected regions of the world on a political basis. There are four regions: the Balkans (S.E. Europe), the newly independent states (NIS) of the FSU, N. Africa, and S. America. There is a project (UMednet, signed and now happening) to connect many of these countries into GEANT (not into the 10 Gbps core, but at the periphery). GEANT now covers 31 countries. There is a political agreement to improve connectivity to S. America (signed and happening). The Balkans are more complex, with many schemes, some by countries, some by NATO, UNESCO etc. States that will accede to the EU (Accession States) will have special arrangements. The SCIC should pass on information about opportunities and support relevant proposals. A pre-requisite is that universities have good connectivity.

The EU Sixth Framework programme (the Fifth, which gave impetus to DataGrid, DataTAG & TEN-155, runs out in 2001) would like to have funds flowing within 12 months; it may get approved in the coming weeks or else in June. There are still discussions on foci and funding. It will last for 5 years.

French connection - Denis Linglin

RENATER 2 is connected to GEANT at 2.5 Gbps. A tender for RENATER 3 is in the pipeline: the OC3 & OC12 links will go to 2.5 Gbps, and the link to STARTAP will improve to 622 Mbps next year. The budget is 20M euros per year for RENATER. RENATER is not an offspring of particle physics (PP); it was set up in parallel to the PP networks. Monitoring statistics are not always available.

HEP connectivity: nationally, 18 French labs are connected by private links to RENATER POPs (not via the universities) and thence to CCIN2P3; each lab has reserved bandwidth of between 2 & 34 Mbps. Most international connectivity is private, and 80% is due to BaBar. On international links there is 34 Mbps of reserved bandwidth within RENATER to Chicago, used for BaBar uDSTs; it is well saturated, using bbftp. The rest of the BaBar, D0, CERN etc. traffic goes through CERN and thus to GEANT and USLIC to the US, using 80% of the line. The CERN-Lyon link is 155 Mbps, going to 622 Mbps in June/July. 1 TByte/day is being transmitted from SLAC to IN2P3; the current limit is 100 Mbps through CERN. RENATER goes to 100 Mbps in January 2002. Throughput is very sensitive to local bottlenecks; both lines are limited at the moment, possibly due to ESNet. CCIN2P3 has 10 Gbps to Paris and then via GEANT at 2.5 Gbps to CERN; it is unclear whether the private line to CERN will be kept. PP is driving research networks at the moment.

Politicians do not like parallel independent research networks. In France a research network, VTHD, has been set up, but PP is not allowed to enter it as research, only to use it (i.e. HEP are not regarded as network researchers). The same holds for DataTAG: INRIA can enter but not IN2P3. The only way for IN2P3 to get involved is via WP7, but the French leadership of WP7 has been given to INRIA.

CERN - Manuel Delfino

The LHC Computing Grid project has been approved. DataTAG is 50% network research and 50% interoperability between Grid projects; it will implement links using wavelengths. NIKHEF has set up a wavelength link to CERN. The CERN-US link often gets attacked. There is a request to close the triangle with a wavelength link from Amsterdam to CERN. Within Switzerland there will be better connections between the ETH schools, probably done with wavelengths and probably connecting to France & Germany. CERN has set up an OpenLab project (see http://cern.ch/openlab/) with Cabletron, KPN & Intel. The CERN openlab is a collaboration between CERN and industrial partners to develop the data-intensive Grid technologies to be used by the worldwide community of scientists working at the next-generation Large Hadron Collider. They hope other vendors will join.

Germany and FSU - Michael Ernst (see Germany status)

Transatlantic connectivity is 2*622 Mbps, ending in NYC and in Hamburg and Frankfurt. US-Germany traffic is peaking at 120 TB/month; it is becoming more balanced between the directions, with traffic from the US still bigger. DANTE plans direct links to Abilene/CANARIE in 2002. UCAID will add another 2*2.5 Gbps, and is proposing for 2002 a joint project, GTRN (Global Terabit Research Network). DFN expects to upgrade the 2*622 Mbps to 2*2.5 Gbps in 1Q2002, with contracts between DFN and 2 providers, Global Crossing and KPNQwest (2.5 Gbps each), tendered in collaboration with DANTE, with the same termination points. ESNet peering is at 34 Mbps.

GEANT has 9 trunks at 10 Gbps and 11 at 2.5 Gbps, and now covers 31 countries (6 countries were just added). There is guaranteed QoS (see www.dante.net/sequin/). There are big differences between the lowest and highest bid offerings (e.g. today the average is 5000, the minimum 40). The 2.5/10 Gbps trunks are SDH on a single wavelength; 10 Gbps costs only 10% more than 2.5 Gbps. Traffic to Europe from DFN is about 40 TByte/month (1/3 of the transatlantic traffic). DESY has 155 Mbps to DFN. Bottlenecks are at the universities, not in DFN, so there are no requests for managed bandwidth.

FSU connections are via satellite. Operational satellite links: Yerevan 192/198 kbps, Minsk 32/32, Almaty 512/128, Baikal 38/38, ... Now there is the SILK project with NATO backing, initially for the Caucasus and Central Asia (8 countries). Bandwidths from NATO are currently 64-512 kbps; they want to go up by a factor of 10, which is not affordable with the current model, and there is no affordable fiber in the Caucasus or Central Asia. So a VSAT technology upgrade is proposed: Mbps rates can be had for $25K/year. It is funded by NATO in the range of $2.5M, co-funded by national governments. It will be staged, with bandwidth increasing from 1 to 5 to 11-25 Mbps; new funding will allow further increases up to 50 Mbps. The satellite provider is a French/Turkish collaboration (EurasiaSat). DESY will provide technical management (the earth station is at DESY). The project is ready to start; the technical and organizational parts are now in place. It will start early in 2002.

Japan & KEK - Yukio Karita
(see Japan status (local) or http://www-nwg.kek.jp/~karita/icfascic-dec01.ppt)

SuperSINET will start service in January 2002. It is intended to be fully photonic, with 10 lambdas from KEK: one for the 10 Gbps IP service, 7 for direct GbE (or 10 GbE) to university HEP groups. There will be a HEP VPN within SuperSINET using MPLS for HEPnet-J. The NII US/Europe plan for SuperSINET in 2003 has 2 lambdas to the US West Coast (each lambda is 2.5 Gbps): one for the IP backbone, one for direct GbE to be used for a KEK-CERN testbed (can CANARIE bridge the SuperSINET lambda and the DataTAG lambdas?). There is a minor upgrade in Jan 2002: 5 OC3 POS for default IP traffic. In Jan 2002 KEK-ESnet will increase from 10 to 20 Mbps.

Links within Asia: KEK-Taiwan (Academia Sinica) is 1.5 Mbps frame relay, to be merged with Academia Sinica's 45 Mbps APAN link. KEK-BINP is 128 kbps, to be upgraded to 512 kbps, waiting for US support for the Russian half-circuit. KEK-IHEP (Beijing) at 128 kbps was to be merged with CAS but may continue to exist. APAN has many inter-regional lines, mainly centered on Japan. There is an ACFA network project, which will use APAN for many connections.

US-CERN link - Harvey Newman, with information from Olivier Martin (see US-CERN link status)

The US-CERN link consortium (USLIC) line has been reliable, apart from one 1-day outage during a holiday and an incident where the phone link used to log in to the router was disconnected because its bill was unpaid. The PIX firewall has a throughput problem, so a bypass is being looked at. CERN has Gbps to SWITCH; to GEANT it has 2.5 Gbps shared (CERN has 1.25 Gbps, and it is expected that access bandwidth to GEANT will double every 12-18 months); and there are 2*155 Mbps to the US. The CIXP continues to grow; CERN has 200 fibers coming in. A market survey with a call for tender was done: it was sent to 29 providers, 19 replied, 17 were invited, 13 responded, the short list was 4 providers, and the final decision is Dec 12. There will be 622 Mbps for production and 2.5 Gbps for DataTAG tests. The DataTAG (3.9M euros) partners are INRIA, INFN, CERN, U of Amsterdam and PPARC, with US (NSF) partners.

Discussions etc

We need to focus on what to present to ICFA. We will have 30 minutes for the report; each presenter should make a short (up to 2 pages) summary of their area, to be submitted before Xmas. Harvey will put these together into an executive summary and get feedback. Talks will be made available on the web; Matthias will put them there. So what is the message we want to present to ICFA?

The next meeting will be after the ICFA meeting (at SLAC in February), preferably soon after it. The first week of March is favorable (Saturday 9th March at CERN).