Streamyx Crisis: TM Wholesale finally owning up

EXCLUSIVE! The month-long Streamyx Crisis that rattled Zamzamzairani Mohd Isa -- the TM Malaysia Business CEO who commands operations at TM Wholesale, TM Retail and TM Net combined, and reports to Super CEO DAWO -- is coming to a formal closure this morning.

When a select group of TM senior officers meets at the Olympia Hall at Wisma TM Annex1 (Cygal) this morning, a Streamyx Crisis Management model is expected to be presented and adopted. Hopefully, your Streamyx woes caused by TM Wholesale, the TM subsidiary that controls and cross-sells the entire backhaul and network infrastructure to Streamyx broadband operator TM Net, will not recur.

This November 16 meeting, according to Little Birds, will also drive home TM Wholesale's dire need for a Crisis Management Process, a Change Management Process and an Improvement Action Plan.

According to the Streamyx Crisis Management model, which was drawn up by the TM Wholesale team and finalised last week, the current month-long Streamyx snarl-up is classified as a Crisis, one notch below Disaster.

StreamyxCrisis.jpg
Note that the commanding officer to restore Streamyx's business continuity is the COO of TM Wholesale


Marconi, Juniper in spotlight

As the protracted Streamyx problem occurred in the midst of nationwide DSLAM upgrades and was compounded by coinciding outages in several ATM (Asynchronous Transfer Mode) links, TM Wholesale has decided to haul up two key vendors, requiring them to rectify the problems with utmost urgency.

Screenshots was made to understand that TM's General Manager in charge of the Secure Network Operations Centre (SNOC), Rukiah Ahmad, has been instructed to notify the ATM vendor, Marconi Malaysia, to fully explain peculiar incidents related to abnormal ATM traffic utilisation and cell loss.

StreamyxCrisis_ATM_Peak.jpg

StreamyxCrisis_CPU_Peak.jpg
A comparison of peaks in ATM Trunk Utilisation and ATM CPU Utilisation, benchmarked against averages in October

At the same time, the SNOC GM has been instructed to notify Juniper Malaysia, requiring it to "explain and ascertain" the peculiar behaviour at two exchanges at Kelana Jaya (ERX KLJ03) and Cyberjaya (CBJ02) during the crisis.

StreamyxCrisis_CPU.jpg

Apart from the abnormal behaviour mentioned above, there was also an incident of High CPU Utilisation recorded in Cyberjaya (ASE CBJ10) on November 3, 2006.

Besides, there was an incident of SCP (Service Control Point) Switchover. Network security experts advising Screenshots say an undetermined continuous internal process is the key suspect behind the High CPU Utilisation and the switchover that followed.

However, another view holds that the High CPU Utilisation on this switch did not affect BRAS (Broadband Remote Access Server) sessions until the switchover occurred.

Nevertheless, SNOC head Rukiah Ahmad has been asked to re-study the entire escalation process for Streamyx degradation.

Screenshots was also made to understand that, at the behest of Zamzamzairani, a meeting was called on November 8 by TM Wholesale CEO, Baharum Salleh, to map out the immediate action plan and a viable model to thwart future crises.

Meanwhile, Zamzamzairani seems to have been convinced that all upgrading of Streamyx networks should proceed as planned.

At the earlier phase of the Crisis, insiders say, Zamzamzairani had considered calling off the upgrading of localised networks currently being implemented by Juniper Malaysia, in which port speeds at the localised exchanges and end-users' sync speeds were increased from 1Mbps to 1.5Mbps for those on DSLAM, and from 1.5Mbps to 2Mbps for those on RDSLAM and RTDSLAM.

It is understood that some 60% of the DSLAMs had been upgraded from 1Mbps to 1.5Mbps between October 7 and October 18, 2006.

Meanwhile, TM Wholesale is expected to adhere to a more controlled approach for the remainder of the upgrading exercise, and all interested parties within TM Wholesale, TM Net and the in-sourced call centre, VADS, are expected to be kept informed of the schedule and implementation areas to avoid a similar crisis.

The new mindset includes the immediate setting-up of a crisis team at the TM Net Operations GM level upon notification of any abnormal customer complaints.

Besides, the head of TM ANOC (Alternate Network Operations Control) has been tasked to expedite the setting up of the Broadband Management Centre.

VADS at wits' end

It is noted that VADS, the customer service operation supporting TM Net and Streamyx end-users, has borne the full brunt of customer complaints throughout the crisis.

VADS was blamed for being unable to help Streamyx customers troubleshoot and rectify the various problems they faced from October through mid-November.

However, people familiar with call centre operations told Screenshots that VADS was not given sufficient resources to zero in on customers' in-situ connection details.

For example, ICOMS (the convergent voice, video and data billing and customer care solution used by VADS) does not have access to other critical databases that could tell more about a customer's localised parameters for fault detection. Using the present version of ICOMS, VADS is unable to pinpoint whether a fault lies with a particular exchange, DSLAM, or port.

Sources told Screenshots that, learning from the current crisis, VADS will now be enabled to break down complaints by DSLAM and exchange area for faster troubleshooting and restoration of Streamyx services.
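
As a rough illustration of the kind of complaint analysis described above, the sketch below tallies complaint records by exchange and DSLAM so the worst-hit areas surface first. The record fields (`exchange`, `dslam`, `symptom`) and the sample labels are hypothetical; nothing here reflects how ICOMS or any TM system actually stores data.

```python
from collections import Counter

def complaint_hotspots(complaints, top_n=3):
    """Count complaints per (exchange, DSLAM) pair, busiest first."""
    counts = Counter((c["exchange"], c["dslam"]) for c in complaints)
    return counts.most_common(top_n)

# Hypothetical complaint log; the exchange and DSLAM labels are made up.
sample_complaints = [
    {"exchange": "EXCH-A", "dslam": "DSLAM-1", "symptom": "DSL blinking"},
    {"exchange": "EXCH-A", "dslam": "DSLAM-1", "symptom": "cannot login"},
    {"exchange": "EXCH-A", "dslam": "DSLAM-1", "symptom": "DSL blinking"},
    {"exchange": "EXCH-B", "dslam": "DSLAM-7", "symptom": "cannot login"},
    {"exchange": "EXCH-B", "dslam": "DSLAM-7", "symptom": "cannot login"},
    {"exchange": "EXCH-C", "dslam": "DSLAM-2", "symptom": "DSL blinking"},
]

for (exchange, dslam), n in complaint_hotspots(sample_complaints):
    print(f"{exchange}/{dslam}: {n} complaints")
```

Grouping on the (exchange, DSLAM) pair rather than on the exchange alone is a design choice made here so that a single faulty DSLAM is not hidden inside an otherwise healthy exchange.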

In addition, VADS will also be given access to TM BMS (basic mappings of network installations) for faster identification of problematic DSLAM areas.

However, it is learnt that this additional access granted to VADS may be withdrawn once the crisis is overcome.

Screenshots to probe further

Since being alerted on October 18 to the current protracted Streamyx snarl-up, Screenshots has been talking to ISP industry vendors, consultants and specialists in network deployment, and even customer call centre operators, to probe the unrevealed.

When the information gathered in the research is analysed, the jigsaw puzzle takes shape. The most likely culprit is none other than TM Wholesale.

Several reliable sources told Screenshots that the current Streamyx problem was first detected as early as October 2, when there was a power trip at TM's RADIUS (Remote Authentication Dial In User Service) at the Brickfields Data Centre.

The incident later developed into a complex state of problems at the end-user side -- sporadic but nationwide -- where two patterns of connection difficulty were detected: ( 1 ) DSL blinking; ( 2 ) DSL stable but cannot login.

Subsequently, the problems escalated from the Klang Valley to include areas classified by TM Wholesale as CBA (Critical Business Areas), namely Wilayah Persekutuan Kuala Lumpur, Wilayah Persekutuan Labuan (where Malaysia's offshore banking is located), Putrajaya, Penang and Johor Selatan.

Screenshots was made to understand that the problem of "DSL Blinking" was most likely caused by the upgrading process currently being implemented by Juniper Malaysia, which has been noted to give rise to a poor Signal-to-Noise Ratio (SNR) Margin.

Insiders in the network vendor industry told Screenshots that the recent port speed upgrading implementations have caused two major impacts on Streamyx Quality of Service (QoS):

  1. It affects Streamyx subscribers at boundary conditions, where increasing the speed reduces the SNR margin, which can cause permanent or intermittent DSL blinking;
  2. The average transport traffic -- as a consequence of the port speed increase -- was expected to go up by up to 25%. This sudden increase can lead to network congestion, causing high latency between network elements and ATM cell loss.

    Under such a stressful environment, industry insiders say, any momentary failure of a network element can cause a sudden burst of authentication attempts that overwhelms the ability of RADIUS and BRAS to process Point-to-Point Protocol (PPP).
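
To see why a momentary failure can snowball, consider a deliberately simplified model: every dropped session redials at once, and the authentication server clears a fixed number of requests per second. The session count and capacity below are invented for illustration and are not TM's figures.

```python
def reauth_backlog(dropped_sessions, auth_capacity_per_sec):
    """Return the backlog of pending logins at the end of each second.

    Simplified model: all dropped sessions redial simultaneously, the
    server authenticates a fixed number per second, and nobody gives up
    or retries more than once.
    """
    if auth_capacity_per_sec <= 0:
        raise ValueError("auth_capacity_per_sec must be positive")
    backlog = dropped_sessions
    timeline = []
    while backlog > 0:
        backlog -= min(backlog, auth_capacity_per_sec)
        timeline.append(backlog)
    return timeline

# Hypothetical numbers: 10,000 sessions drop; the server handles 500 logins/s.
timeline = reauth_backlog(10_000, 500)
print(f"Backlog clears after {len(timeline)} seconds")
```

Real clients that time out tend to retry, so an actual backlog could last longer than this model suggests.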

To rectify the problem, TM Wholesale has been advised to selectively downgrade, starting from October 19, subscribers at boundary conditions that do not meet the following conditions:
1 ) SNR margin greater than 12 dB
2 ) Attenuation less than 48 dB
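
The two thresholds can be read as a simple per-line check. The sketch below is one interpretation of the criteria as reported, not TM Wholesale's actual procedure; the `LineStats` record and the sample readings are hypothetical.

```python
from dataclasses import dataclass

SNR_MARGIN_MIN_DB = 12.0   # line must have an SNR margin greater than this
ATTENUATION_MAX_DB = 48.0  # line must have attenuation less than this

@dataclass
class LineStats:
    subscriber_id: str
    snr_margin_db: float
    attenuation_db: float

def should_downgrade(line):
    """True if the line fails either reported condition."""
    meets_conditions = (
        line.snr_margin_db > SNR_MARGIN_MIN_DB
        and line.attenuation_db < ATTENUATION_MAX_DB
    )
    return not meets_conditions

# Hypothetical readings for three subscribers.
lines = [
    LineStats("user-001", snr_margin_db=15.0, attenuation_db=40.0),
    LineStats("user-002", snr_margin_db=9.5, attenuation_db=40.0),
    LineStats("user-003", snr_margin_db=14.0, attenuation_db=52.0),
]
to_downgrade = [l.subscriber_id for l in lines if should_downgrade(l)]
print(to_downgrade)
```

Because both conditions use strict comparisons, a line sitting exactly on a threshold (for instance an SNR margin of exactly 12 dB) is treated as failing in this sketch.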

On the other hand, there seem to be a variety of likely causes for the "DSL Stable but cannot login" problem, including:
1 ) RADIUS (Remote Authentication Dial In User Service) outage
2 ) RADIUS failed to authenticate users
3 ) High latency between network elements
4 ) BRAS (Broadband Remote Access Server) failure in processing Point-to-Point Protocol (PPP)
5 ) ATM (Asynchronous Transfer Mode) cell loss
6 ) Sudden burst of authentication attempts

All these have prompted caution letters to be issued to the two major TM vendors, namely Marconi and Juniper Malaysia, requiring them to reply with detailed clarification at full speed.

Expect more exposés if these two internationally linked vendors and TM Wholesale don't buck up soon.

And I particularly hate the fact that TM Malaysia Business, under CEO Zamzamzairani, has the audacity to keep us Streamyx subscribers in the dark for over one month throughout this Streamyx Crisis! If you don't tell us why we have to pay full rate for your sub-standard service, we will find it out and tell the world -- for ourselves and by ourselves.

TO BE CONTINUED...

Comments

Dear Jeff,

I wonder if it would be possible for the TM Net user group that you chair to press TM Net to offer a discount, or even better, to write off this month's bill.

I know it sounds like a lot of hot air, but I'm sure many users would agree that if TM Net did not provide the service they said they would in the contract, then morally they should make amends by doing something in the user's favour.

But then again, we are not a very moral country, are we?

"And I particularly hate the fact that TM Malaysian Business, under CEO Zamzamzairani, has the audacity to keep us Streamyx subscribers in the dark for over one month throughout this Streamyx Crisis! If you don't tell us why we have to pay full rate for your sub-standard service, we will find it out and tell the world -- for ourselves and by ourselves".

They are hoping that they can get away with it without causing a ruckus. Trouble is, these people think that the populace is still backward. It would have been great for them to come clean from the outset, admit their problem and tell their customers that they are doing something about it. We would all have been more patient and more forgiving. Now, the least they can do is to give us users a discount for their sub-standard service and the downtime we experienced.....especially when the cost of living for ordinary Malaysians is so high nowadays.

Idiot vendors. Pfft.

Is limiting/blocking p2p also part of these 'upgrades'? Vendors acting as consultants probably suggest these features (while at the same time selling equipment/software for the mentioned tasks) to TMnet, hoping that they can regain bandwidth instead of buying more. So in the end customers (end-users) pay more, get less.

It's near the end of 2006 already and we're still using 1Mbps (sorry it's 1.5Mbps)!!?

Correction aredale, you only got around 600~850 kbps speed on "best effort" for 1Mbps package. Only the 512 Kbps package actually got it as 512!!!

You need them to cap it at the DSLAM port for 1.5Mbps, and have an account cap thru software system for 1 Mbps, ONLY THEN you get the true advertise subscription.

wooo..thkx.helps in my network studies :P getting better understanding hahah.
nice one

Hi Jeff
Err, could you ask them to include Klang as well? Besides the above problems, I have also been downgraded to the 512 kbps package for 2 weeks even though I am a paying 1 Mbps customer. First the port, then the cable, then the port again (according to various VADS staff) was supposedly the problem, but to date no action has been taken. Why? Not even VADS could tell me, because the technicians have yet to get back to them since 7th November, though it was verbally mentioned that it's their fault.

Wrong. 512 doesn't get 512. The best I've ever done is 60kB = 480 kbps and that was only for a 10 minute period in the nearly three years I have been on Streamyx.

Occasionally it hits 62kB but only for a few seconds and then it drops right back so that could be just a glitch.

Most times it is only 40-50 kB or 320-400 kbps.

I second that. Since I have been stuck with the 512 kbps package, my speed is always between 350 kbps and 430 kbps. It never reaches the max of 512 kbps at all. Even on the 1 Mbps package, my speed then was between 620 kbps and 750 kbps (using the speedometer without any programs on). I miss the good old times.

Power trip.
But after all, it is just a RADIUS.
Could it affect the sync to DSLAM?
Could it affect slow bandwidth?
Is that really the cause?

Even at Brickfields iDC itself, the machines are losing connectivity to the local gateway from time to time from last month until now. Wondering what is causing that.

Anyway, they are getting 3com-Huawei to replace Cisco in doing switching. And implementing some Juniper firewall for hosting customers soon.

Cheers!

i learn a lot from this topic. really helps in my network studies.

I've worked with Zamzamzairani at a start-up company called PactZone Digital Sdn. Bhd. back when it started in 2003. I was in fact one of the pioneers in the company. We were a VoIP (Voice over Internet Protocol) telecommunications provider and the potential then was huge for us. The market was wide open for VoIP to reap all the rewards.

Zamzam was our exalted Chairman. Let's just say PactZone Digital closed one year later after going through really bad management. Everyone was laid off without months of pay because apparently the company managed to incur a RM2 million debt with Telekom Malaysia within a year!

To think it's the same man now running things in Telekom is really surprising. So don't put all your hopes on Zamzamzairani to steer TM and TMnet in the right direction. He just might steer it into the ground.

amen tmnut

Does anyone notice its much more difficult to access Maxis and DiGi WAP Gateways too?

With Maxis Gateway access slowing down drastically, I'm pretty sure it began on Oct 2nd, came up just before Deepa-Raya, then went blonkers from Nov 11th again.

Im getting easily 171 kB/s for RM88. Hahaha.. no complaints, yet..

I dont expect Streamyx to be upgraded anymore, just maintain. And of course tend to those with difficulties.

I can't check my mail. At all. Either through Outlook or Web Access. Or use MSN. The e-mail just hangs and MSN won't recognise me. A check with the troubleshooter said it was a problem with the ports, but when I tried to repair, it couldn't.
Is this related to this current debacle?
I've had these problems intermittently for the past month, and thought it was my settings, but now I see the very technical sounding stuff here, I think my problems may be similar to what other people are experiencing. Can someone confirm this (or not) for all the not so techie people out here. and suggest (also for the layman to understand) how it might be resolved.
Thanks! and thanks to Jeff for bringing attention to this problem.
Obviously my actual connection is fine to surf the web, or I wouldn't be typing this.

at last, tmnet announce the crisis on their website.. but as always, they didn't tell you what is really going on..

Exarkun,
Pactzonedigital was run by idiots who siphoned off money and created a big hole. Zam came in at a later stage to turn-around the company with the help of new management team. Things got really bad when banks turned down their rescue plans. So, please, give the man some credit and don't run him down for whatever good he was trying to do. If you got suckered at PZ, blame the previous owners who cashed out.. it was daylight robbery.
Zam is facing a big task at TM and my advice to him is to not even trust the very people running the show downstairs.. He was put there by AWO to clean up the mess. Get your facts correct ok.. I've known him for more than 30 years.. he is a clean and god fearing man with God, King and Country as his motive in life.

Well,

P & C document just leak out..........I think TM will look on this matter........

TM need challenge and Michael Lai is not resigning.......
