It is better to manage the army than to manage the people. And the enemy.

User personas are so important! So what are the processes and methods for creating user personas?

The concept of user personas was proposed by Alan Cooper, the father of interaction design. He described user personas as virtual representations of real users: models of the target users built on real data. Keep in mind that a persona stands in for the majority of your users through a virtual representative, whereas an intelligence analyst wants something more direct.

As for how founders should create user personas, I hope everyone remembers one thing: you need to know what your key users, your core users, look like. Are they male or female? What do they like? Can you describe your core users in one sentence? User personas are even considered the nuclear weapon of internet companies.

For example, Tencent, Baidu, and Alibaba are known as BAT. I believe the core capability of BAT is their ability to create user personas from big data. Let me share a joke: everyone knows Tencent is strong in product development. If you create a product that catches Tencent's attention, they can quickly surpass you with their own product. Why? Because Tencent has a very powerful user mining capability.

For instance, Tencent's technology is divided into T1, T2, T3, T4, and T5. T5 is equivalent to a chief scientist, usually just one or two people. T4 has quite a few people at Tencent, dozens of them. What does T4 mean? Tencent calls it the T4 expert group, and those who enter T4 are generally technical experts who have operated on hundreds of millions of users. When Tencent encounters a problem, they consult the T4 expert group, which specializes in user personas...

User personas are so powerful, so strong, so much like a nuclear weapon. Here I want to talk about the second core point: how to do it? A founder is not a product manager; how can they create effective user personas? They need to find seed users.

Many people ask, what are seed users? Users are stratified; do you know what types of users there are? There are target users, and among target users, there are core users; among core users, there are seed users. Seed users are like seeds; they are opinion leaders among users, those who have a voice among users, and even key figures among core users.

When creating user personas, it is essential to find seed users. In fact, finding seed users is the first step for almost all companies when developing products. For example, what are Xiaomi's seed users? Xiaomi is currently a leading smartphone company in China, and its seed users are enthusiasts.

However, Huawei's sales are also among the top in the country. So what are Huawei's mainstream users? Are they the same as Xiaomi's? No, they are different. What are Huawei's seed users? They are business elites.

Looking at OPPO, OPPO's sales are also among the top in the country. Does OPPO's user persona look the same as theirs? No, it is different. OPPO's user persona is young women. Therefore, finding seed users is very important; hence, those who have seed users will dominate the market.

  1. What are user personas?

User personas are user models of target groups built on a series of real data, which abstract corresponding labels based on user attributes and behavioral characteristics to create a virtual image. They mainly include basic attributes, social attributes, behavioral attributes, and psychological attributes.

It is important to note that user personas are derived from clustering analysis of a group of users with common characteristics, and therefore are not aimed at a specific individual.

image

image

User label set

  2. Steps to create user personas

(1) Clarify the purpose of the persona

Confirming the purpose of the persona is a very basic but crucial step. It is essential to understand what operational or marketing effects are expected from building user personas, so that when constructing the label system, planning can be made regarding the depth, breadth, and timeliness of the data to ensure that the underlying design is scientifically reasonable.

(2) Data collection

A persona is only effective when it is based on objective, real data. Data should be collected along multiple dimensions, such as industry data, overall user data, user attribute data, user behavior data, and user growth data. It can be obtained through industry research, user interviews, user-submitted profile information and questionnaires, and data collected from the platform's front end and back end.

(3) Data cleaning

Regarding the data collected, there may be non-target data, invalid data, and false data, so it is necessary to filter the raw data.

(4) Feature engineering

Feature engineering transforms raw data into features, which involves some transformation and structuring work. This step removes outliers (for example, in an e-commerce app a user may win a phone for a few cents in a flash sale even though their everyday purchases average over a thousand yuan), standardizes the data (for example, consumers may pay in both RMB and USD, which must be converted to a single currency), and standardizes the criteria used to assign labels.
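The two examples in this step, currency unification and outlier removal, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the exchange rate, the order data, and the choice of a median-based outlier rule are all made up for the example and are not taken from any particular platform.

```python
from statistics import median

# Hypothetical exchange rate, for illustration only.
USD_TO_RMB = 7.0


def to_rmb(amount, currency):
    """Convert an order amount to RMB so that all prices share one unit."""
    if currency == "RMB":
        return amount
    if currency == "USD":
        return amount * USD_TO_RMB
    raise ValueError(f"unsupported currency: {currency}")


def drop_outliers(values, k=5.0):
    """Keep values within k median absolute deviations (MAD) of the median."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # All values are (nearly) identical; nothing to filter.
        return list(values)
    return [v for v in values if abs(v - med) <= k * mad]


# Made-up order history: one flash-sale purchase for 0.1 yuan among normal orders.
orders = [(0.1, "RMB"), (1200, "RMB"), (150, "USD"), (980, "RMB"), (1100, "RMB")]
prices = [to_rmb(amount, currency) for amount, currency in orders]
clean = drop_outliers(prices)
print(clean)  # [1200, 1050.0, 980, 1100]
```

A median-based rule is used here because a single extreme value barely moves the median, whereas it would drag a mean-based threshold toward itself.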

The technologies used in persona construction include data statistics, machine learning, and natural language processing (NLP), as shown in the figure. Specific methods for constructing personas will be detailed in later sections of this chapter.

image

Technologies for constructing user personas

(5) Data labeling

In this step, the obtained data is mapped to the constructed labels, and various user features are combined. The choice of labels directly affects the richness and accuracy of the final persona, so data labeling needs to be combined with the functions and characteristics of the app itself. For example, e-commerce apps need to refine labels related to price sensitivity, while news apps need to describe content features from as many perspectives as possible using labels.
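As a rough illustration of mapping data to a label, the sketch below assigns a price-sensitivity label to an e-commerce user. The function name, the 10% discount cutoff, and the 0.3 / 0.6 thresholds are hypothetical values chosen for the example; a real label system would tune them against its own data.

```python
def price_sensitivity_label(orders):
    """Label a user by the share of their purchases made at a discount.

    orders: list of (price_paid, list_price) tuples.
    All thresholds below are hypothetical.
    """
    if not orders:
        return "unknown"
    # Count an order as discounted if the user paid under 90% of list price.
    discounted = sum(1 for paid, listed in orders if paid < 0.9 * listed)
    share = discounted / len(orders)
    if share >= 0.6:
        return "price-sensitive"
    if share >= 0.3:
        return "moderately price-sensitive"
    return "price-insensitive"


print(price_sensitivity_label([(80, 100), (50, 100), (100, 100)]))  # price-sensitive
```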

image

The order in which labels are built depends mainly on the difficulty of construction and the dependencies between labels, as shown in the figure.

image

Priority of constructing various labels

(6) Construct user personas

The labels are divided into three categories:

The first category is demographic attributes

Demographic attributes include age, gender, education level, life stage, income level, consumption level, and industry.

  • Gender: Male, Female, Unknown
  • Age: Under 12, 12-17, 18-19, 20-24, 25-29, 30-34, 35-39, 40-44, 45-49, 50-54, 55-59, 60-64, 65 and above, Unknown
  • Monthly Income: Below 3500 yuan, 3500-5000 yuan, 5000-8000 yuan, 8000-12500 yuan, 12500-25000 yuan, 25001-40000 yuan, Above 40000 yuan, Unknown
  • Marital Status: Single, Married, Divorced, Unknown
  • Industry: Advertising / Marketing / Public Relations; Aerospace; Agriculture / Forestry / Chemical; Automobile; Computer / Internet; Construction; Education / Student; Energy / Mining; Finance / Insurance / Real Estate; Government / Military / Real Estate; Service Industry; Media / Publishing / Entertainment; Medical / Insurance Services; Pharmaceutical; Retail; Telecommunications / Network; Travel / Transportation; Others
  • Education Level: Junior High School and Below, High School, Vocational School, Associate Degree, Bachelor's Degree, Master's Degree, Doctorate

Demographic labels

The second category is interest attributes

Before constructing user interest personas, it is necessary to model the content that users interact with. To ensure that interest personas are both accurate and generalize well, we build a hierarchical interest label system and match against labels at several levels of granularity at the same time.

How to construct a hierarchical interest label? Simply put, look at what content and things users are interested in, and extract, label, and statistically analyze the content and things they are interested in.
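The "extract, label, and count" idea can be sketched as follows. The sketch assumes each piece of content has already been tagged with a slash-separated label path; that path format, the function name, and the sample data are assumptions made for the example. Counting every level of the path is what lets coarse labels (good generalization) and fine labels (good accuracy) coexist in one profile.

```python
from collections import Counter


def interest_profile(viewed_labels):
    """Build a hierarchical interest profile from viewed content labels.

    viewed_labels: list of label paths such as "sports/basketball/NBA".
    Returns a dict mapping each label, at every level, to the share of
    viewed items that fall under it.
    """
    if not viewed_labels:
        return {}
    counts = Counter()
    for path in viewed_labels:
        parts = path.split("/")
        # Count the path at every depth: "sports", "sports/basketball", ...
        for depth in range(1, len(parts) + 1):
            counts["/".join(parts[:depth])] += 1
    total = len(viewed_labels)
    return {label: n / total for label, n in counts.items()}


# Made-up viewing history for one user.
views = [
    "sports/basketball/NBA",
    "sports/football",
    "tech/phones",
    "sports/basketball/CBA",
]
profile = interest_profile(views)
print(profile["sports"], profile["sports/basketball"])  # 0.75 0.5
```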

The third category is geographic attributes

The city of permanent residence is mined from the user's IP address information: each IP address is resolved to its city, and the cities in which the user's IPs appear are counted to obtain the permanent city label.
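A minimal sketch of this counting step is shown below. The IP-to-city lookup is passed in as a function because the text does not say which geolocation database is used; the toy table, the reserved documentation IP addresses, and the function name are all assumptions for the example.

```python
from collections import Counter


def permanent_city(ip_log, ip_to_city):
    """Return the city in which a user's IP addresses appear most often.

    ip_log: list of IP strings observed for one user.
    ip_to_city: hypothetical lookup function returning a city name or None.
    """
    cities = Counter(ip_to_city(ip) for ip in ip_log)
    cities.pop(None, None)  # ignore IPs that could not be resolved
    if not cities:
        return None
    return cities.most_common(1)[0][0]


# Toy lookup table using reserved documentation addresses.
TOY_TABLE = {"192.0.2.1": "Hangzhou", "198.51.100.7": "Beijing"}
log = ["192.0.2.1", "192.0.2.1", "198.51.100.7", "203.0.113.9"]
print(permanent_city(log, TOY_TABLE.get))  # Hangzhou
```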

The user's permanent city label can not only be used to count the distribution of users in various regions but can also identify business travelers, tourists, etc., based on the user's travel trajectory between cities, as shown in the figure, which is an example of the travel trajectory of a population.

image

Travel trajectory of a population

GPS data is generally collected from mobile devices, but many mobile apps do not have permission to access user GPS information. The main apps that can obtain user GPS information are navigation apps like Baidu Maps and Didi Chuxing. Additionally, the collected user GPS data is relatively sparse.

Baidu Maps combines this GPS data with time-of-day information to build GPS labels for users' workplaces and homes. In addition, Baidu Maps uses GPS information to count traffic flow on individual roads for traffic condition analysis, as shown in the real-time traffic map of Beijing, where red indicates congested routes.
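How GPS points and time periods might combine into home and workplace labels can be sketched as below. This is not Baidu Maps' actual method, which the text does not describe; the hour windows, the grid size obtained by rounding coordinates, and the sample points are all assumptions made for the illustration.

```python
from collections import Counter
from datetime import datetime


def home_and_work(points, precision=3):
    """Guess home and workplace grid cells from timestamped GPS points.

    points: list of (datetime, latitude, longitude).
    Night-time points vote for "home"; weekday daytime points vote for "work".
    """
    night, day = Counter(), Counter()
    for ts, lat, lon in points:
        # Rounding to 3 decimals groups points into cells of roughly 100 m.
        cell = (round(lat, precision), round(lon, precision))
        if ts.hour >= 22 or ts.hour < 6:
            night[cell] += 1
        elif ts.weekday() < 5 and 10 <= ts.hour < 17:
            day[cell] += 1
    home = night.most_common(1)[0][0] if night else None
    work = day.most_common(1)[0][0] if day else None
    return home, work


# Made-up points: two at night, two on weekday afternoons, one on a Saturday.
points = [
    (datetime(2024, 1, 8, 23, 0), 39.9001, 116.4001),
    (datetime(2024, 1, 9, 2, 30), 39.9002, 116.4003),
    (datetime(2024, 1, 9, 11, 0), 40.0502, 116.3003),
    (datetime(2024, 1, 10, 15, 0), 40.0501, 116.3001),
    (datetime(2024, 1, 13, 12, 0), 39.95, 116.35),
]
home, work = home_and_work(points)
print(home, work)  # (39.9, 116.4) (40.05, 116.3)
```

Because collected GPS data is sparse, a voting scheme like this only becomes meaningful once enough points have accumulated for a user.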

image

Real-time traffic map of Beijing

(7) Generate the persona

After the data runs in the model, the final generated persona can be presented in visual forms such as the one below. User personas are not static; thus, the model needs to have a certain degree of flexibility to adjust and modify the persona based on the user's dynamic behavior.

image

Information Collection#

Privacy#

Packet capture information

Topics actively participated in (discussions and experiences about social events)

Favorite emojis and stickers, groups and channels joined

Speech (identity, life, occupation, habits, organization, complaints, income, values, stance, etc.)

Writing style (expression, sentence structure, punctuation, etc.)

Screenshot content (fonts, application pages, icons in the notification bar, etc.)

Shared links and images (references)

Photos (people, objects, locations, iconic objects, weather, lighting, identity information, etc.)

Photos of social activities (name, event time, poster, slogan)

Regional characteristics (local specialties, cigarettes, totems, plants, terrain)

Voice (accent, dialect, age, environmental noise)

Shared files (metadata, invisible watermarks, original image exif information, file source, content)

Account information (avatar, username, signature/profile, password, same information used across different platforms)
(All domestic platforms have begun to display IP location information. Is there a project that collects the domains used to display this location information, so that, without switching to global mode, these domains can be copied in one click and added as rules to protect privacy?)
Solution👇

Bilibili IP Location API#

host, api.bilibili.com, Location IP

Zhihu IP Location API#

ip-cidr, 103.41.167.0/24, Location IP

Weibo IP Location API#

host-suffix, api.weibo.cn, Location IP

Tieba IP Location API#

host,www.baidu.com,Location IP

Toutiao IP Location API#

host-suffix,toutiaoapi.com, Location IP

Douyin IP Location API#

host-keyword,core-c-lq,Location IP
host-keyword,core-lq, Location IP
host-keyword,normal-c-lq, Location IP
host-keyword,normal-lq, Location IP
host-keyword,search-quic-lq, Location IP
host-keyword,search-lq, Location I
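The rules above use four match types. The sketch below shows how such rules are commonly interpreted by rule-based proxy clients; these semantics are an assumption, since the text does not name the client, and "Location IP" is simply the policy name chosen in the list. The test hostnames are made up.

```python
import ipaddress


def parse_rule(line):
    """Split a rule line into ((kind, pattern), policy)."""
    kind, pattern, policy = [part.strip() for part in line.split(",")]
    return (kind.lower(), pattern), policy


def matches(rule, host=None, ip=None):
    """Return True if a request to `host` or `ip` matches the rule.

    Assumed semantics: host = exact match, host-suffix = domain suffix,
    host-keyword = substring, ip-cidr = address inside the network.
    """
    kind, pattern = rule
    if kind == "host":
        return host == pattern
    if kind == "host-suffix":
        return host is not None and (host == pattern or host.endswith("." + pattern))
    if kind == "host-keyword":
        return host is not None and pattern in host
    if kind == "ip-cidr":
        return ip is not None and ipaddress.ip_address(ip) in ipaddress.ip_network(pattern)
    raise ValueError(f"unknown rule type: {kind}")


rule, policy = parse_rule("host-suffix,toutiaoapi.com, Location IP")
print(matches(rule, host="api.toutiaoapi.com"), policy)  # True Location IP
```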

How to Infer Specific Locations from a Photo | Introduction to Network Tracing#

Introduction#

Before starting the serious tutorial, it is necessary to clarify a few points:

  1. This article will introduce a reasoning game called "Network Tracing," in which players infer the specific location where a photo was taken based solely on the photo and limited hints. It can be considered a form of Open-Source Intelligence (OSINT) [1], which refers to the practice of legally collecting data and information from publicly available and accessible resources.
  2. This article will not cover how to obtain and analyze "off-site information," such as "locals can tell at a glance," or how to gather identity and residence information from the questioner's historical content or social platforms. This article does not encourage the use of "human flesh search" or other behaviors that may infringe on others' privacy in "Network Tracing."
  3. The author is merely an enthusiast of "Network Tracing" and has no vested interests with the social platforms and tools mentioned. Additionally, the author is an amateur player, and the following content is a summary of personal experiences, serving as a quick-start guide rather than a rigorous professional tutorial. It is hoped that this article can help some interested readers get started with this game and also raise awareness of the privacy risks that may arise from posting photos on public channels.

Note: This article aims to popularize the process of "how ordinary people can infer real locations from a photo" and hopes to raise some awareness among readers. If readers explore and research based on this article, they should respect others' privacy and relevant laws.

In 2011, a post titled "How I Inferred Wang Luodan's Address" went viral. The author inferred Wang Luodan's previous address in just over 40 minutes using a few of her Weibo posts, his knowledge of Beijing, and Google Earth. (Wang Luodan was a popular actress at the time, starring in the hit workplace drama "Du Lala's Promotion"; yes, this gives away the author's age.) While readers exclaimed "amazing," they also worried about being investigated themselves and said they would never dare to post anything online again.

image

Related reports. Image from Sohu Media

Ten years later, in 2021, thanks to promotion by many enthusiasts and creators, a detective game called "Network Tracing" [note 1] entered the public eye: given only a single image and a few hints, experts can find where the image was taken using nothing but an internet-connected computer, and some can even determine when it was taken. Today's internet users, while exclaiming "wow, that's impressive," likewise worry about being investigated themselves and say they would never dare to post anything online again.

image

The Network Tracing section of the Chao Fan community. Image from Chao Fan Community

image

Bilibili user "I am EyeOpener" is one of the more influential introducers of Network Tracing. Image from bilibili

The history of the internet repeats itself, but the cycle spirals upward. In the past decade, the number of global internet users has doubled, and the number of web pages has quadrupled. Although we ourselves haven't made much progress, this investigative technique has matured on the back of the massive amount of information on the internet. Its formal name is Open Source Investigations (OSI) or Open Source Intelligence (OSINT) [note 2], which refers to techniques for investigating with open-source information on the internet.

"Network Tracing" is one of the most impactful forms of open-source investigation because it appears to be highly dramatic: a single image can accurately pinpoint a location. However, this drama stems from people's underestimation of the amount of information contained in a single image and the scale and breadth of open-source information on the internet. Are you worried that your photos might expose your privacy? Are you curious about how detectives unravel the clues to determine the photographer's location? Today, with the introduction of this article, you too can unveil the mystery of Network Tracing, become a network detective, and become your own expert in online content security.

How to Play Network Tracing#

The Chao Fan community is a social website similar to Tieba, where the Network Tracing section is highly influential within the community. Every day, many users post their photos here, challenging "detectives." The moderator team regularly holds Network Tracing point competitions, with winners receiving exquisite trophies. (Not an advertisement, just a statement. The author has not registered yet.)

image

Content from the Network Tracing section of the Chao Fan community. Image from Chao Fan Community

Not all images are suitable for puzzles. In the Chao Fan community, puzzle images focus on urban buildings, transportation (especially planes and high-speed trains), roads, scenic spots, etc., primarily from long-range shots. If you take a picture of a trinket on your desk or a small flower by the roadside, detectives will find it challenging to extract useful information from the image content.

Network Tracing puzzles can also be in the form of panoramic images, videos, and other multimedia formats. The "GeoGuessr" introduced by the Minority Report [3] and Baidu Maps' "Panoramic City Explorer" [4] are examples that use panoramic images as carriers.

The basic idea of Network Tracing can be divided into the following three steps:

  • Extract: Carefully observe the image and extract all valid information within it. No matter how small or blurry it is, do not overlook it.
  • Analyze: Use your knowledge and internet tools to analyze the extracted information and narrow down the investigation range.
  • Verify: Use internet tools to search through the range obtained in the analysis phase until the location is found. If unsuccessful, return to the first two steps and try again.

Extracting and analyzing information is the key to Network Tracing and is where the fun lies. This relies on detectives' broad knowledge, strong internet information retrieval abilities, and long-term experience accumulation.

Network Tracing detectives prefer to arrive at answers through logical reasoning rather than brute force. The more challenging the reasoning process, the greater the sense of achievement in arriving at the answer. Considering the complexity of reality, this reasoning process is not strict; it is more based on probabilistic assumptions derived from life experiences.

What is Hidden in the Image?#

To become a qualified Network Tracing detective, the first step is to learn how to look at images and uncover hidden information within them. Broadly speaking, an image can contain the following types of information: textual information, infrastructure information, and natural geographic information.

Textual Information#

Textual information is the fastest and simplest way to infer geographic locations. Compared to other types of information, textual information has significant advantages:

  • May directly reveal the location: Textual information such as road signs, government buildings, station names, and house numbers are strongly associated with geographic locations and can easily become giveaway clues.
  • No professional threshold: You may need some professional knowledge and comparison analysis to determine the species of plants or the model of an airplane, but interpreting textual information requires no such expertise; just being able to read is enough.
  • Easy to search: You can directly search for text in search engines. While many search engines support image searches, their accuracy cannot compare to that of text.

Therefore, Network Tracing detectives do not overlook any textual information in the image, even if it is blurry.

For example, given the following image and asked about the photographer's location:

image

This is a photo of a Sha County snack shop. However, directly searching for "Sha County snacks" is not a good idea: there are tens of thousands of Sha County snack shops nationwide. Careful observation of the details reveals several pieces of textual information: the character "记" on the adjacent shop, the reflections of "王府" and "旺基" in the door and window, the house number plate reading "香榭" and "23," and the advertisement for "星桥莫拉克专卖店" on the electric vehicle's mudguard.

image

Electric vehicles rarely cross cities, so the license plate and the advertisement on the mudguard can help infer the city where the photo was taken. The city name on the license plate is blurry, but it can be seen to have two characters, so the focus shifts to the advertisement.

Searching for "星桥" (Xingqiao) nationwide and excluding loose matches like "三星大桥" leaves 12 candidates: Xingqiao Street in Hangzhou, Xingqiao Village in Huzhou, Xingqiao Village in Sanming, Xingqiao Village in Fuzhou, Xingqiao Village in Ziyang, Xingqiao Village in Guang'an, Xingqiao Village in Guangyuan, Xingqiao Town in Chongqing, Xingqiao Village in Lijiang, Xingqiao Village in Shaoyang, Xingqiao Village in Zhuzhou, and Xingqiao Village in Xianning. The reflections in the door and window suggest dense commercial activity, which does not resemble an ordinary rural area.

image

Nationwide "Xingqiao" (partial). Image from Baidu Maps

The advertisement also provides the phone number for "莫拉克专卖店." In Chinese mobile numbers, the first three digits identify the carrier and the next four identify the region, so the first seven digits are enough to determine where the number was registered. That is not necessarily the photographer's location, but it very likely is.

image

The phone number is somewhat blurry, but the visible digits in the first seven digits are "1508*64," with the fifth digit resembling 3, 5, or 8. Queries reveal that 1508364 belongs to Xinyu, Jiangxi, 1508564 belongs to Zunyi, Guizhou, and 1508864 belongs to Hangzhou, Zhejiang. Comparing with the search results for Xingqiao, only Hangzhou overlaps. Therefore, it can be tentatively assumed that the photographer is in Hangzhou, and the next step is to search.
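The elimination step here is a simple set intersection, sketched below. The prefix table contains only the three lookups quoted in the text; a real lookup would use a number-attribution database, which is not shown. The function and variable names are made up for the example.

```python
# Prefix lookups quoted in the text; not a complete attribution database.
PREFIX_TO_CITY = {
    "1508364": "Xinyu",
    "1508564": "Zunyi",
    "1508864": "Hangzhou",
}

# Cities that have a place named Xingqiao, from the map search above.
XINGQIAO_CITIES = {
    "Hangzhou", "Huzhou", "Sanming", "Fuzhou", "Ziyang", "Guang'an",
    "Guangyuan", "Chongqing", "Lijiang", "Shaoyang", "Zhuzhou", "Xianning",
}


def cities_for_pattern(pattern, options, table):
    """Fill the unreadable digit '*' with each option and look up the city."""
    cities = set()
    for digit in options:
        prefix = pattern.replace("*", digit)
        if prefix in table:
            cities.add(table[prefix])
    return cities


# The fifth digit looks like 3, 5, or 8.
phone_cities = cities_for_pattern("1508*64", "358", PREFIX_TO_CITY)
overlap = phone_cities & XINGQIAO_CITIES
print(overlap)  # {'Hangzhou'}
```

Two weak clues, each leaving several candidates, combine into a single candidate once they are intersected.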

Next, notice the house number plate reading "香榭" and "23." The text on a house number plate could be a road name, community name, or village name. Considering the dense commercial area nearby, it is likely a road name. The character after "香榭" is obscured, but judging by its proportional position, it should be "路" (road) or "街" (street).

image

Searching for "香榭路" (Xiangxie Road) in Hangzhou does indeed turn up a road of that name, and it belongs to Xingqiao Street.

image

Xiangxie Road in Hangzhou. Image from Baidu Maps

In this area, searching for Sha County snacks leads to a "suspected target":

image

Suspected Sha County snack shop. Image from Baidu Maps
image

Unfortunately, the street view is outdated, and no similar storefront was found. However, the architectural style and road sign format match.

image

Panoramic view of Xiangxie Road. Image from Baidu Maps

On Meituan, this shop can be found, with the house number "香榭路 23-1," and the storefront image matches the puzzle image. Thus, it is confirmed that the photographer's location is near the entrance of the Sha County snack shop at No. 23-1, Xiangxie Road, Linping District, Hangzhou, Zhejiang Province.

image

Sha County snack shop in Tiandu City. Image from Meituan

The above is a "giveaway question" in Network Tracing, as the answer can be derived simply by analyzing textual information.

Infrastructure Information#

From large urban areas to small trash bins, infrastructure encompasses municipal, transportation, and architectural fields. Investigating locations based on infrastructure relies on the following two points:

  • Identifiability: As products of industrial society, infrastructure with similar functions often has similar appearances, allowing us to discern "what this is." Identifying large facilities such as ports, airports, and stadiums can be crucial for determining locations.
  • Regional Differences: Influenced by national and regional policies, climate conditions, and economic geography, infrastructure can vary significantly. This allows us to infer "where this is."

Here are some commonly used types of infrastructure information:

  • Landmark Buildings: Landmark buildings generally possess certain uniqueness, allowing for location identification through image searches. If they are imitations, finding them through news reports is also not difficult.
  • Urban Areas: The skyline and aerial views of central urban areas, urban villages, and urban-rural junctions differ, and the size of the city can also affect these urban landscapes.
  • Houses: Houses generally face south, which can be used to determine direction. Rural houses in different regions have different styles, such as red-tiled roofs, white walls with black tiles, cave dwellings, and courtyard houses, which can help infer the region.
  • Roads: Different types of railways and highways have distinctive facilities, such as railway overhead contact lines, embankment slopes, and protective fencing. Railway stations, highway toll booths, interchanges, and traffic signs are also important clues. Uniquely styled streetlights may also become breakthroughs in solving puzzles.
  • Vehicles: License plates can help infer the country of origin, and some can even be further specified to a first-level administrative region. If cars drive on the left, countries where cars drive on the right can be ruled out, and vice versa. City buses and taxis usually have uniform or series paint jobs.
  • Trains and Planes: The shape details of trains and planes can determine their models. Train and plane schedules can be queried online. Special paint jobs can also reveal important information. Based on the angle of the photo taken on the plane, it can be roughly judged whether the plane is taking off or landing.
  • Special Facilities: Weather stations, radar stations, stadiums, ports, and docks often have special facilities, such as stadium-specific lighting and dock gantry cranes. Identifying these special facilities requires relevant background knowledge.

Infrastructure information is the most common and primary type of information in Network Tracing, and this article cannot cover all aspects, only scratching the surface. Here, we introduce a typical case of determining a location based on infrastructure information, which comes from the blog of open-source information expert NixIntel. This expert's blog provides rich material for domestic Network Tracing bloggers.

image

The second puzzle image, from Swapfiets company

This is an advertisement photo released by Swapfiets, and the location of the photo needs to be found. NixIntel extracted the following information from the image:

  • This is a city with tall buildings.
  • The tracks on the road indicate that the city operates trams.
  • Some license plates are visible, formatted as PJ-620-*.
  • The lamp post has black and white stripes.
  • The building on the left has prominent tall white columns.

image

NixIntel visited the company's official website and learned that the company operated in the Netherlands, Germany, Denmark, and Belgium at that time. The license plate can be used to determine which country this is. The WorldLicensePlates website collects license plate styles from around the world, and the styles of the four countries are as follows:

image

Comparison of license plates from the four countries. Image from WorldLicensePlates

After comparison, the Dutch license plate style is the closest, so the next step is to search in the Netherlands. If it is not the Netherlands, it is not a big deal; we can go back and choose again.

Having selected the country, is there a way to narrow it down to a province or city? Looking back at the clues, the tram seems promising, as not all cities have them. Checking the Wikipedia page for trams in the Netherlands reveals that only five cities in the Netherlands currently operate trams: Delft, Utrecht, Rotterdam, Amsterdam, and The Hague.

image

Wikipedia entry for trams in the Netherlands, image from Wikipedia

The tall white-columned building is likely in one of these five cities. The Phorio website collects large buildings from around the world, allowing filtering by city and providing images. The page for Delft is as follows:

image

Delft page on Phorio. Image from NixIntel's blog; at the time of writing, this site was under maintenance.

Delft had no obvious matches, as its buildings are generally not as large as those in the advertisement photo. Utrecht has several larger commercial buildings, but still no match. Rotterdam, Amsterdam, and The Hague are much larger cities, and the answer is likely among them. Large cities naturally have many tall buildings; here are Rotterdam's:

image

Overview of tall buildings in Rotterdam. Image source same as above

After browsing, a familiar building with prominent tall white columns stands out. It is called the Unilever Building:

image

Unilever Building. Image source same as above

Entering street view, the familiar black and white lamp post, tram tracks, and road surface confirm that the photo was taken here.

image

Street view of Rotterdam. Image source Google Earth

This case exemplifies the power of open-source information on the internet. Without using professional knowledge, we can extract a few information points and utilize the diverse resources of the internet to explore and arrive at the answer. This is the superpower that the internet era has granted each of us.

Natural Geographic Information#

Common natural geographic information includes light and shadow, weather, terrain, and vegetation. Extracting and interpreting natural geographic information requires a broad and deep accumulation of knowledge in natural geography, as well as intuition based on that foundation. In many famous Network Tracing cases, key steps are often just a statement from an expert saying, "I feel like this area," which is difficult to convey.

Common natural geographic information includes:

  • Terrain: Water bodies (rivers, lakes, reservoirs, oceans), mountains (snow cover), soil color, etc.
  • Vegetation: Plants usually have specific distribution areas; when the target range is unclear, plant information can assist in exclusion. However, due to the widespread introduction of species, this exclusion is not very reliable.
  • Light and Shadow: Shadows can provide a rough direction, helping to determine the direction of travel or road direction. The Suncalc website can help determine shadow length, location, or time. It is usually easy to tell whether it is day or night from the image, which helps exclude some options that do not match the day-night state of the image.
  • Weather: Weather is a common auxiliary piece of information. Based on historical weather changes in the location, the date range of the photo can be inferred.
  • People: This can be considered geographic information. The ethnicity of people in the image can help guess the location where the photo was taken.
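The geometry behind the light-and-shadow clue can be sketched as follows, assuming flat ground and a known sun position. The function name and the sample values are made up; in practice the sun's altitude and azimuth for a given place and time would come from a tool such as Suncalc.

```python
import math


def shadow(height_m, sun_altitude_deg, sun_azimuth_deg):
    """Return (shadow_length_m, shadow_bearing_deg) on flat ground.

    Azimuth and bearing are measured clockwise from north.
    The shadow always points directly away from the sun.
    """
    if sun_altitude_deg <= 0:
        raise ValueError("the sun is at or below the horizon")
    length = height_m / math.tan(math.radians(sun_altitude_deg))
    bearing = (sun_azimuth_deg + 180) % 360
    return length, bearing


# A 10 m lamp post with the sun due south (azimuth 180) at 45 degrees altitude:
# the shadow is about 10 m long and points due north (bearing 0).
length, bearing = shadow(10, 45, 180)
print(round(length, 2), bearing)
```

Run in reverse, the same relationship lets a measured shadow direction fix the compass orientation of a road, and a measured shadow length constrain the time of day.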

This section uses a post from the Chao Fan community as an example. This question was solved collaboratively by two prominent users in the Chao Fan community, Anshan Wu Yanzu and Cat (hereinafter referred to as "Cat"). The puzzle image is as follows, asking for the name of the mountain range below the plane.

image

The third puzzle image. Image from Chao Fan Community

Anshan Wu Yanzu's judgment of this image was:

Based on the weather and mountain vegetation, it can be inferred that it should be north of Beijing (including the three northeastern provinces and parts of Inner Mongolia).

Based on the red-tiled roofs of the distant houses and the presence of corn crops in front, it can be basically determined to be in the northeastern region.

image

This judgment relies mostly on experience, and the northeastern region is still a large area. That is characteristic of inference from natural geographic information: it demands rich experiential knowledge, yet it cannot narrow the search down to a very small area.

Cat further provided two points of judgment:

The railway on the left has streetlights and a station sign, suggesting that the photo was taken near a railway station.

The distant houses should be oriented north-south, and since shadows north of the Tropic of Cancer cannot fall to the south, the orientation can be inferred as follows:

image

The left railway runs roughly north-south, while the railway crossing runs east-west, with the intersection located within 500 meters of the station.

At this point, all the information in the image has been extracted. Manually checking every railway crossing in the northeastern region is feasible, but the time cost is too high and it is easy to miss a candidate. Is there a tool that can do this job for us? Yes! Meet a groundbreaking search tool in the field of open-source investigation: Overpass Turbo. It is a web-based data mining tool built on OpenStreetMap. Put simply, it is a map search engine that finds every location satisfying user-defined conditions and spatial relationships. Its coverage of points of interest in China is sparse, but railway-related data is relatively complete.

Don't get excited too early; the next part may be daunting: using it requires learning a query language. Overpass Turbo queries are written in Overpass QL and run against the Overpass API.

image

The core code used in this example, provided by Cat, is shown below. I tried adding a high-speed rail condition to narrow the results, but the maxspeed tag was missing from the data, so the original code is used here. Due to space limitations, only a brief explanation is given; interested readers can search for tutorials to learn more.

// Search for railway bridges longer than 1 kilometer within the area, stored in w1
way[railway = rail][bridge](if: length() > 1000)({{bbox}}) -> .w1;
// Search for non-bridge railways longer than 1 kilometer that intersect with w1 (distance = 0), stored in w2
way(around.w1: 0)[railway = rail][!bridge](if: length() > 1000) -> .w2;
// Provide all railway stations within 500 meters of w1 and 20 meters of w2
node(around.w1: 500)(around.w2: 20)[railway = station];
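Note that `{{bbox}}` is an Overpass Turbo shortcut for the current map view. To run the same search outside the web interface, or to split a large region into several rounds, the placeholder has to be replaced with explicit coordinates in the order south, west, north, east. The following Python sketch shows one way this could be done. The bounding box for the northeastern region is a rough assumption, the helper names are hypothetical, and the `[out:json]`, `[timeout:180]`, and `out;` lines are additions that a standalone query needs.

```python
QUERY_TEMPLATE = """\
[out:json][timeout:180];
way[railway = rail][bridge](if: length() > 1000)({bbox}) -> .w1;
way(around.w1: 0)[railway = rail][!bridge](if: length() > 1000) -> .w2;
node(around.w1: 500)(around.w2: 20)[railway = station];
out;
"""


def split_bbox(south, west, north, east, rounds):
    """Split a bounding box into `rounds` vertical strips of equal width."""
    step = (east - west) / rounds
    return [
        (south, west + i * step, north, west + (i + 1) * step)
        for i in range(rounds)
    ]


def build_query(bbox):
    """Fill the query template with an explicit (south, west, north, east) box."""
    south, west, north, east = bbox
    return QUERY_TEMPLATE.format(bbox=f"{south},{west},{north},{east}")


# Rough, assumed bounding box for the northeastern region, searched in two rounds.
tiles = split_bbox(38, 115, 54, 135, rounds=2)
queries = [build_query(tile) for tile in tiles]
```

Each generated query can be pasted into Overpass Turbo or sent to an Overpass API endpoint. Only the first statement is restricted to the strip, so a bridge that straddles a strip boundary may be reported in both rounds.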

The northeastern region is large and can be searched in two or three rounds. The results are as follows, with circles indicating hits:

image

image

Overpass Turbo search results. Image from Chao Fan Community

Based on the previously analyzed railway direction, a station that meets the conditions can be filtered out: Tahuangqi Station.

image

image

Tahuangqi Station. Image from Chao Fan Community, Gaode Maps
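The final filtering step, keeping only candidates whose railway runs in the direction read off the photo, can also be approximated in code. The sketch below is one possible approach rather than the method used in the community post: it computes the initial compass bearing between two points on a track and checks whether the track is roughly aligned with a given axis. The 30-degree tolerance and the function names are assumptions chosen for illustration.

```python
import math


def initial_bearing(lat1, lon1, lat2, lon2):
    """Initial compass bearing in degrees from point 1 to point 2."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    x = math.sin(dlon) * math.cos(phi2)
    y = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return (math.degrees(math.atan2(x, y)) + 360) % 360


def axis_deviation(bearing_deg, axis_deg):
    """Smallest angle between an undirected line and an axis.

    axis_deg = 0 means north-south, axis_deg = 90 means east-west.
    """
    diff = abs(bearing_deg - axis_deg) % 180
    return min(diff, 180 - diff)


def runs_roughly(bearing_deg, axis_deg, tolerance_deg=30):
    """True if the line lies within the tolerance of the given axis."""
    return axis_deviation(bearing_deg, axis_deg) <= tolerance_deg
```

A track is undirected, so bearings of 5 and 185 degrees both count as north-south. Feeding in the first and last node of each candidate way returned by the query is enough for a coarse filter.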

This case does not rely solely on natural geographic information, but the judgment of the region significantly reduced the search workload. With Overpass Turbo, rapid large-scale screening becomes possible.

Off-site Information#

When the information in the image is insufficient to determine the location, detectives must look for off-site hints. If any of the following touches on privacy or legal issues, use it only with the consent of the questioner or the parties involved, or with authorization from the relevant authorities.

  • Image EXIF information: If the questioner publishes the original image and the online platform has not removed the EXIF information, this information can directly locate the shooting location.
  • Questioner's historical records: Check the content the questioner has posted on public social platforms, including personal homepages and comments. Some people use the same avatar or username and post similar content across different public platforms, which makes cross-platform searching easy.
  • Social network relationships: The questioner's friend network may also expose their identity. Friends they frequently interact with may share similar life experiences, interests, or belong to the same organization, and the content posted by friends may also be closely related to the questioner.
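To illustrate why unstripped EXIF data is so revealing: EXIF records a GPS position in the `GPSLatitude` and `GPSLongitude` tags as degrees, minutes, and seconds, with the hemisphere stored separately in `GPSLatitudeRef` and `GPSLongitudeRef`. The sketch below converts such values into decimal degrees that can be pasted straight into a map. The sample coordinates are invented, and reading the tags from an actual file (for example with an image library) is deliberately left out.

```python
def dms_to_decimal(dms, ref):
    """Convert (degrees, minutes, seconds) plus N/S/E/W into decimal degrees."""
    degrees, minutes, seconds = (float(value) for value in dms)
    decimal = degrees + minutes / 60 + seconds / 3600
    # Southern and western hemispheres are negative in decimal notation.
    return -decimal if ref in ("S", "W") else decimal


# Invented sample values in the shape EXIF GPS tags usually take.
sample_gps = {
    "GPSLatitude": (41, 30, 0),
    "GPSLatitudeRef": "N",
    "GPSLongitude": (123, 15, 0),
    "GPSLongitudeRef": "E",
}

latitude = dms_to_decimal(sample_gps["GPSLatitude"], sample_gps["GPSLatitudeRef"])
longitude = dms_to_decimal(sample_gps["GPSLongitude"], sample_gps["GPSLongitudeRef"])
```

With these sample values the result is latitude 41.5 and longitude 123.25, precise enough to pin a photo to a specific spot.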

Will I Never Dare to Post Anything Online Again?#

Network Tracing often raises privacy concerns. To avoid controversy, the Chao Fan community and the Twitter account @Quiztime mainly feature questions in which the questioner posts their own photos. Even so, there will always be ill-intentioned people who investigate others in secret. Everyone should therefore be cautious when posting, and assume that any image could reveal where it was taken.

  • Is the publishing platform public? Does someone have to be added as a friend, or obtain my consent, before they can view what I posted there? Information that anyone can access needs to be handled with great caution.
  • If the shooting location is known, will it involve core privacy? Showing places you have visited or public places does not have much impact; however, if the shooting location is related to your and your friends' residences or workplaces, you must ensure that the image does not contain information that can be investigated as described above, and the text does not involve descriptions of commuting or transportation.
  • Avoid posting images related to national security, such as weapons and military.

By paying attention to the above points, you will not end up like Wang Luodan, who had her home exposed.

If the image does not involve core privacy but you also do not want to be investigated for the shooting location, you should note:

  • Avoid posting multiple images of the same location, as this can provide ample information for open-source investigations.
  • Avoid posting images with a lot of textual information.
  • Avoid posting images with special infrastructure information and natural geographic information.
  • Avoid posting original images.

I believe that after reading this article, readers have understood the basic gameplay of Network Tracing and can analyze the important clues contained in a photo. Now, open your Weibo and Moments; you can also analyze which images may expose your location, thus becoming your own expert in online content security.

Coach, I Want to Learn#

As long as privacy and security are respected, Network Tracing is a beneficial puzzle game. It broadens players' knowledge, deepens their understanding of both the real world and the internet, and trains reasoning skills and the ability to find information independently.

This article focuses on extracting information from images and mentions online resources only in passing. In my view, knowing what information can be searched for matters more than knowing how to search, and the biggest obstacle for most people who try Network Tracing is failing to realize that key information exists in the image at all. Once that hurdle is cleared, you can use image search engines to obtain further information, or filter through websites that specialize in that kind of information. If you do not know which websites to use, you can search for them or ask on dedicated forums; these are things you pick up gradually through experience.

What forums can I use for communication? What influential blogs can I visit? What resources can help me? These are the Network Tracing questions left for you: I have provided many hints earlier, and now it is time to exercise your ability to obtain information independently.

I wish you a smooth journey in your online exploration!

References#

During the creation of this article, I referenced the following articles and would like to thank the original authors:

  1. ^ Similar terms include open-source investigations (Open Source Investigations, OSI) and online open-source investigations (Online Open Source Investigations, OOSI).
  2. ^ The most feared are those with ulterior motives; a user on Renren used just two images and 40 minutes to infer the address of a Beijing celebrity https://page.om.qq.com/page/OzDezp5M825FCotpeuYPEl6w0
  3. ^ https://m.weibo.cn/status/3886914195127757
  4. ^ I am EyeOpener's personal space - Bilibili https://space.bilibili.com/43645887/channel/seriesdetail?sid=90709
  5. ^ The personal space of "Exploration Address" - Bilibili https://space.bilibili.com/1960160215
  6. ^ The personal space of "Cosmic Encyclopedia" - Bilibili https://space.bilibili.com/93569847
  7. ^ The personal space of "Night Point Short Video" - Bilibili https://space.bilibili.com/1078123935
  8. ^ Verif!cation Quiz Bot (@quiztime) - Twitter https://twitter.com/quiztime
  9. ^ GeoConfirmed - War Ukraine https://www.google.com/maps/d/viewer?mid=10YK14-QB25penu8jeS4hBVarzGKZsVgj&ll=48.104096492535504%2C31.957569662788224&z=6
  10. ^ There are also similar groups on Douban; interested readers can visit the "Let's Play Network Tracing Group": https://www.douban.com/group/725884/
  11. ^ Crossing the Yangtze River, bridge construction, architectural photography - Hui Tu Network https://www.huitu.com/photo/show/20180218/204610197016.html
  12. ^ The first high-precision soil color map in China http://www.ssa.ac.cn/?p=7955
  13. ^ Based on the shadows in the image, find the specific location of the mountain (mountain range) under the plane https://www.bilibili.com/video/BV1LG4y1a79k
  14. ^ National standard GB 17733-2008 https://openstd.samr.gov.cn/bzgk/gb/newGbInfo?hcno=A4BC390727C25D327CF14ADE1C0F27A3
  15. ^ National standard GB 50180-2018 https://baigongbao.oss-cn-beijing.aliyuncs.com/2020/09/29/AGZeRrtGrN.pdf
  16. ^ Solving POI locations using open map data (domestic) https://invited-aquarius-173.notion.site/POI-f7b3c76127404e43ac4a462c40afcc1e
  17. ^ About - NixIntel https://nixintel.info/about/