How mobile operators analyze our data
Subscriber data flows through many systems, and it is best not to give outsiders access to it. How secure is the information in such a system, taking every link in the chain into account? Let's figure it out.

Why do mobile operators need prebilling?

Prebilling is used to reconcile and reload various kinds of data, for example to reconcile the status of services on the equipment with their status in billing. Sometimes a subscriber keeps using services even though they are already blocked in billing. Or the subscriber used the services, but the equipment produced no record of it. Many such situations are possible, and most of them are resolved with the help of prebilling.

I once wrote a term paper on optimizing a company's business processes and calculating ROI. The problem with calculating ROI was not a lack of source data: I did not understand which "ruler" to measure it with. Much the same often happens with prebilling. You can tune and improve processing endlessly, but sooner or later circumstances and data will line up so that an exception occurs. You can build an ideal system for operating and monitoring the auxiliary billing and prebilling systems, but you cannot guarantee uninterrupted operation of the equipment and the data transmission channels.

That is why there is a duplicate system that compares the data in billing with the data that prebilling sent to billing. Its task is to catch what left the equipment but for some reason "did not land on the subscriber." This duplicating and controlling role is usually played by the FMS, the Fraud Management System. Of course, its main purpose is not to control prebilling at all, but to identify fraudulent schemes and, as a result, to monitor data losses and discrepancies between equipment data and billing data.

In fact, there are many ways to use prebilling.
For example, it can reconcile the subscriber's state on the equipment with the state in the CRM. In such a scheme, prebilling uses SOAP to receive data from the equipment (HSS, VLR, HLR, AUC, EIR).

Another use is accumulating data for further processing. This option applies when we have thousands of records from the equipment (GGSN-SGSN, telephony): dumping all of these records into the subscriber's itemized statement is utter madness, not to mention that so many small records put a heavy load on every system. For this reason, a scheme in which prebilling first accumulates the data received from the equipment and only then passes it on solves the problem.

Hewlett-Packard Internet Usage Manager (HP IUM) Prebilling

Imagine a large meat grinder into which meat, vegetables and loaves of bread are thrown, everything that will fit. A variety of products goes in, but everything comes out in the same shape. We can change the plate and get a different shape at the output, but the principle and the way the products are processed stay the same: screw, knife, plate. This is the classic prebilling scheme: data collection, processing and output. In IUM prebilling, the links of this chain are called encapsulator, aggregator and datastore.

It is important to understand that the input must be complete: there is a minimum amount of information without which further processing is useless. If a block or data element is missing, we get an error or a warning that processing is impossible, since the operations cannot be performed without this data. It is therefore very important that the equipment generates record files with a set and type of data that is strictly defined by the manufacturer. Each type of equipment has its own processor (collector) that works only with its own input data format.
For example, you can't simply take a file from CISCO PGW-SGW equipment containing the Internet traffic of mobile subscribers and feed it to a collector that processes the stream from Iskratel Si3000 fixed-line equipment. If we do this, then at best we will get an exception during processing, and at worst all processing of that stream will stop, since the collector handler will fail with an error and wait until we deal with the file that is "broken" from its point of view. As a rule, all prebilling systems react badly to data that a specific collector processor has not been configured to handle.

The stream of parsed data (RAW) is first formed at the encapsulator level, and it can already be transformed and filtered there. This is done when the stream has to be changed before the aggregation stage and the change must apply to the entire data flow, whichever aggregation schemes it later passes through.

Files (.cdr, .log, and others) with records of subscriber activity arrive from both local and remote sources (FTP, SFTP); other protocols can also be supported. The parser processes the files using different Java classes.

Since the prebilling system in normal operation is not designed to store the history of processed files (and there may be hundreds of thousands of them per day), the file is deleted from the source after processing. For various reasons, the file is not always deleted correctly. As a result, records from the file are sometimes processed repeatedly or with a long delay (once the file could finally be deleted). To prevent such duplicates, there are protection mechanisms: checking for duplicate files or records, checking the timestamps in records, and so on.

One of the most vulnerable points here is sensitivity to data volume.
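The duplicate check mentioned above can be illustrated with a small filter that remembers recently seen record keys. This is only a sketch under stated assumptions: the class name `DuplicateFilter`, the `max_keys` limit and the choice of (file name, record number) as the key are hypothetical and are not taken from any particular prebilling product.

```python
from collections import OrderedDict


class DuplicateFilter:
    """Remembers the most recently seen record keys, up to a fixed limit."""

    def __init__(self, max_keys):
        self.max_keys = max_keys
        self._seen = OrderedDict()

    def is_duplicate(self, key):
        if key in self._seen:
            # Seen before: refresh its position and report a duplicate.
            self._seen.move_to_end(key)
            return True
        self._seen[key] = None
        if len(self._seen) > self.max_keys:
            # History is full: forget the oldest key.
            self._seen.popitem(last=False)
        return False


# Hypothetical record key: (file name, record sequence number).
dedupe = DuplicateFilter(max_keys=2)
results = [
    dedupe.is_duplicate(("a.cdr", 1)),  # first time seen
    dedupe.is_duplicate(("a.cdr", 1)),  # immediate repeat is caught
    dedupe.is_duplicate(("a.cdr", 2)),
    dedupe.is_duplicate(("a.cdr", 3)),  # pushes ("a.cdr", 1) out of history
    dedupe.is_duplicate(("a.cdr", 1)),  # late repeat is no longer caught
]
print(results)
```

The last call shows the trade-off: once a key has been evicted from the bounded history, a delayed duplicate passes through unnoticed.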
The more data we store (in memory, in databases), the slower we process new data and the more resources we consume, and eventually we still hit the limit beyond which we are forced to delete old data. For this reason, auxiliary databases (MySQL, TimesTen, Oracle, and so on) are usually used to store this metadata. As a result, we get yet another system that affects the operation of prebilling, with all the security issues that follow from it.

How does prebilling work?

Modern prebilling is a set of modules, usually written in Java, that can be managed in a graphical interface using the standard copy, paste, move, and drag-and-drop operations. Working in this interface is simple and clear. It mostly runs on a Linux- or Unix-based operating system, less often on Windows.

The main problems usually arise during testing or error detection, because the data passes through many rule chains and is enriched with data from other systems. Seeing what happens to the data at each stage is not always convenient or easy to understand. So we have to look for the cause by tracing changes to the relevant variables in the logs.

The weakness of this system is its complexity and the human factor. Any exception leads to data loss or incorrectly formed data.

The data is processed sequentially. If an error occurs at the input, that is, an exception that prevents the data from being received and processed correctly, either the entire input stream stalls or the batch of incorrect data is discarded. The parsed RAW stream goes on to the next stage, aggregation. There may be several aggregation schemes, and they are isolated from each other. It is like a single stream of water entering a shower: passing through the shower head, it splits into separate jets, some thick, others quite thin.

After aggregation, the data is ready for delivery to the consumer.
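The collection, aggregation and delivery chain described above can be sketched in a few lines. This is a minimal illustration, not HP IUM code: the function names, the `id;timestamp;bytes` line format and the sample values are all made up for the example.

```python
from collections import defaultdict


def encapsulate(lines):
    """Parse raw text lines into records; set aside lines that are incomplete."""
    records, rejected = [], []
    for line in lines:
        parts = line.strip().split(";")
        if len(parts) != 3 or not parts[2].isdigit():
            rejected.append(line)  # incomplete input cannot be processed
            continue
        subscriber, timestamp, volume = parts
        records.append({"id": subscriber, "ts": timestamp, "bytes": int(volume)})
    return records, rejected


def aggregate(records):
    """Collapse many small records into one total per subscriber."""
    totals = defaultdict(int)
    for record in records:
        totals[record["id"]] += record["bytes"]
    return dict(totals)


def store(totals):
    """Format one output line per subscriber, ready for delivery."""
    return [f"{subscriber};{total}" for subscriber, total in sorted(totals.items())]


# Made-up input: three valid usage records and one malformed line.
raw = [
    "70000000001;2022-03-13T14:00:00;1200",
    "70000000001;2022-03-13T14:00:05;800",
    "70000000002;2022-03-13T14:00:07;500",
    "broken line",
]
records, rejected = encapsulate(raw)
output = store(aggregate(records))
print(output)
print(rejected)
```

Here the malformed line is discarded rather than stalling the stream, which is one of the two behaviours described above; a real collector may instead stop and wait for the problem to be fixed.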
Delivery can go directly to databases, or by writing to a file and sending it on, or simply by writing to the prebilling repository, where the data will stay until the repository is emptied.

After processing at the first level, the data can be passed to the second level and beyond. Such a ladder is needed to increase processing speed and distribute the load. At the second stage, another stream can be added to our data stream, and the streams can be mixed, split, copied, merged, and so on. The final stage is always the delivery of data to the systems that consume it.

The tasks of prebilling do not include (and rightly so!) monitoring whether input and output data have been received and delivered; this should be handled by separate systems.

Privacy of prebilling

Often, the time between the use of a service and its appearance in billing should not exceed a few minutes. As a rule, the metadata needed to process a specific batch of data is stored in a database (MySQL, Oracle, Solid). Input and output data almost always sit in the directory of a particular collector stream, so anyone who has access to that directory (for example, the root user) can read them.

The prebilling configuration itself, with its set of rules and the access details for databases, FTP and so on, is stored encrypted in a file database. If the login and password for prebilling are unknown, it is not so easy to export the configuration. Any change to the processing logic (rules) is recorded in the prebilling configuration log file (who changed what, and when).

Even if data is passed directly along the chain of collector handlers inside prebilling (without being exported to a file), it is still temporarily stored as a file in the handler directory and can be accessed there if desired.

The data processed in prebilling is depersonalized: it does not contain full names, addresses or passport data.
Therefore, even if you get access to this information, you will not learn the subscriber's personal data from it. But you can pick out some information by a specific number, IP or other identifier.

With access to the prebilling configuration, you obtain the credentials for all the related systems it works with. As a rule, access to those systems is allowed only from the server on which prebilling runs, but this is not always the case.

If you reach the directories where the handlers keep their file data, you will be able to change the files that are waiting to be sent to consumers. Often these are the most ordinary text documents. The picture is then as follows: prebilling received and processed the data, but the data never reached the final system; it disappeared into a "black hole". It will be difficult to find the cause of these losses, since only part of the data is lost. In any case, it will be impossible to reproduce the loss in order to investigate its cause. You can look at the input and output data, but you will not be able to tell where the missing data went. All that remains for the attacker is to cover their tracks in the operating system.
13-03-2022, 14:45