
I'm sorry, I'm back!

It was a dark, moonless night, windy and rainy, so black you could not see your own hand in front of you, as if something were about to happen out there in the darkness.

Due to an "accident" and the pandemic, I spent a while as a "vegetable" and very nearly passed on to another dimension, but fortunately I was "rescued" and brought back. For years I had no idea anyone was looking for me, and I am grateful to everyone who still cares about me. Thank you.

It is not I who am wrong, but this world, fated as it is by forces unseen.

In all these years I knew nothing of the people and things that had been looking for me; I had not checked my QQ or WeChat messages in a very long time.
Seeing my avatar light up again after so many years of being grey, many friends came to ask me what had happened.
I am deeply grateful to these online friends I have never met; perhaps "friends" is the warmer word. You have remembered me all along, and that moves me deeply.

People had been searching for me for three whole years.
Searching for me all over the world.

I am incredibly touched.
Now, let's get to the main content.

About the Blog#

The blog was first set up on 2016-06-09, nearly 8 years ago now, and it has survived disaster after disaster.
In that time I met many companions; we discussed, improved, and grew together, which is why I do not want the blog to simply vanish.
The blog data used to be backed up automatically to Qiniu Cloud, but only now did I discover that, for some unknown reason, the backups had stopped around October 2019.
They say the internet has a memory. In the spirit of "the head may be severed and blood may flow, but data must not be lost," I searched the vast web for days and finally found an archived copy of my blog. You can click here to view the historical archive of my blog (the Wayback Machine, a well-known overseas web-archiving project).

So I wrote a Python script to crawl the archived articles and comments.
Once the crawl finished, I discovered that someone had already backed me up (my blog had previously been sponsored by someone for three years).
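As an aside, the script further down works from a prepared list of archived article URLs. Here is a minimal sketch of how such a list could be assembled with the Wayback Machine's public CDX API; the domain is only an example, and this is not necessarily how the original list was built:

```python
import requests

def list_snapshots(domain):
    # Query the Wayback Machine CDX API for every captured page of a domain.
    resp = requests.get(
        "https://web.archive.org/cdx/search/cdx",
        params={
            "url": domain + "/*",        # every page under the domain
            "output": "json",
            "fl": "timestamp,original",  # fields to return
            "collapse": "urlkey",        # one row per unique URL
        },
        timeout=30,
    )
    rows = resp.json()
    # The first row is the header, e.g. ["timestamp", "original"].
    return [f"https://web.archive.org/web/{ts}/{url}" for ts, url in rows[1:]]

if __name__ == "__main__":
    for snapshot in list_snapshots("52ecy.cn")[:10]:
        print(snapshot)
```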

Blog Migration#

The blog has been migrated to Blog Garden, which was, frankly, unavoidable.

The blog's original domain names (52ecy.cn, moeins.cn, moeins.com) were snapped up by someone else after they expired, and since negotiation went nowhere, they cannot be recovered for now.
With no better option, I temporarily migrated the blog to Blog Garden (whether to self-host again is something to weigh later). Then again, I no longer have much free time to tinker with a system of my own, so hosting on Blog Garden is actually the more worry-free choice. The trade-off is that commenting is less convenient, since you must log in to comment; of course, if you have any questions, you can @ me directly in the group.

Actually, I had the idea of changing the theme as far back as the end of 2018, and this time I truly procrastinated (I can really drag things out; I admire myself).
Unfortunately, the old emlog system had no suitable ready-made theme, and I was too lazy to migrate, so it kept getting put off.

I had long wanted to replace that theme: it was plain, not especially good-looking, and had no personality; it simply did not match my style.
And while my aesthetic sense is excellent, writing a theme that lives up to it is a bit beyond me.

Later I stumbled upon the blog of Blog Garden user 不忘编码 and realized that Blog Garden allows this degree of customization.
Not wanting my blog to be lost and become unreachable ever again, I decided to migrate to Blog Garden for the time being.

I had always wanted to switch to an anime-style theme like this one.
The blog's current styling is based on the WordPress theme Sakura by 樱花庄的白猫, as ported to Blog Garden by 不忘编码, but it had many bugs and rough details. I spent two more days optimizing it, and many pages are still unpolished; I will take it slowly from here on ~~(if something can be done tomorrow, why not leave it to tomorrow's me?)~~

Because the previous emlog blog used an old version of the TinyMCE rich-text editor, the HTML it generated for articles is quite messy, and converting to Blog Garden's markdown format therefore raised many style-compatibility issues. I have already fixed what I can, but I cannot guarantee that every article displays correctly; I will keep patching them as I spot problems.
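Much of that cleanup can be roughed out mechanically. A minimal sketch, assuming the third-party markdownify package (not what this migration actually used), of flattening old rich-text HTML into markdown before fixing the leftovers by hand:

```python
from markdownify import markdownify as md  # pip install markdownify

# Messy TinyMCE-era HTML: inline styles, <font> tags, non-breaking spaces.
old_html = '<p style="text-align:center"><font color="red"><strong>Hello</strong></font>&nbsp;world</p>'

# markdownify keeps the text content and drops markup it does not understand.
markdown = md(old_html, heading_style="ATX")  # use '#'-style headings
print(markdown)
```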

Blog Garden does not support one-click migration from self-hosted blog systems, so the original publication times and comments could not be carried over directly. I wanted to preserve the original flavor, though, so I wrote them into the articles themselves; private comments remain hidden.
(I have raised the self-hosted-migration issue with the Blog Garden team, and they said they would add support in future development. I won't press them; they have their own struggles. The articles were published with the cnblogs VS Code plugin.)

I also made a separate page for the migrated friendly links, but many of the linked sites are no longer reachable, and some have already removed their links to me.
Now that I am no longer on an independent domain, I feel too embarrassed to ask for link exchanges...

All the images on the old blog were hosted on Sina, and I always worried they might vanish one day, so I usually kept one copy on Qiniu Cloud and one on Sina. They have now all been migrated to Blog Garden.

Why Do I Write a Blog?#

Some people may wonder: is this really so important?
Sometimes I want to say something, to write something, but there is no one to tell and nowhere to write it; I simply need a place like this.
I don't blog for traffic or money; I just want a small space of my own, to mess around in my own little circle.
Perhaps it is precisely this passion that let me keep going, slowly; unfortunately, things did not go as planned, and accidents happened.

Many friends from the blogging circle have either disappeared or stopped updating, which is a real pity; I never got the chance to know them better ~~(especially that 月宅 guy)~~

@寒穹 even said to me: "阿珏, how are you still such a 二刺螈 (anime nerd) after all these years?"
I suppose the only thing that has not changed in all these years is me.

I am very happy and grateful that so many people still remember me and treat me so well.

It seems I shouldn't post so many pictures.
In memory of my lost blog; may it live forever at my 127.0.0.1.

Python Code#

The Python code I used for the migration is not especially polished, but I am recording it here for anyone who wants to learn from it.

import os
import re
import time
import requests
from bs4 import BeautifulSoup
from urllib.parse import urlparse, parse_qs

# Input file (one archived article URL per line) and output folder for the results
file_path = "C:\\Users\\Administrator\\Desktop\\blog\\content.txt"
save_folder = "C:\\Users\\Administrator\\Desktop\\blog\\content\\"

def save_to_file(data, file_name):
    try:
        file_path = os.path.join(save_folder, file_name + ".txt")
        with open(file_path, 'a', encoding='utf-8') as file:
            file.write(data)
        print("Data has been successfully saved to file:", file_path)
    except Exception as e:
        print("Error saving file:", e)


def remove_html_tags(text):
    soup = BeautifulSoup(text, 'html.parser')
    return soup.get_text()


def comment(html_content):
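    # Rebuild each archived comment (avatar, author, link, time, body) as
    # markdown, using '>' depth markers to preserve reply nesting.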
    comment_matches = re.findall(r'<div class="comment (.*?)" id="comment-\d+">[\s\S]*?<img .*?inal=".*?202.*?/([^"]+)"/>[\s\S]*?<div ' +
                                 'class="comment-content">(.*?)</div>[\s\S]*?itle=".*?">(.*?)</span>[\s\S]*?<span class="comment-time">(.*?)</span>',
                                 html_content, re.DOTALL)
    article_comments = ''
    if comment_matches:
        i = 0
        for comment_match in comment_matches:
            if 'comment-children' in comment_match[0]:
                i += 1
                is_reply_comment = '>' * i
            else:
                is_reply_comment = '>'
                i = 1

            # Avatar size controlled at 40
            # Compatible with gravatar avatar https://secure.gravatar.com/avatar/ 
            if 'gravatar.com' in comment_match[1]:
                avatar_url = '![](' + str(re.sub(r'(\?|&)s=\d+', '\\1s=40', str(comment_match[1]))) + ')    ' 
            else: 
                parsed_url = urlparse(comment_match[1])
                query_params = parse_qs(parsed_url.query)
                dst_uin = query_params.get('dst_uin', ['1638211921'])
                avatar_url = '![]('+'https://q1.qlogo.cn/g?b=qq&nk='+str(dst_uin[0])+'&s=40'+')    '   

            comment_content = comment_match[2].strip()  
            nickname = comment_match[3].strip()  
            comment_time = comment_match[4].strip()  
            link_url = re.search(r'030.*?/(.*?)" .*? rel', nickname)  

            # Construct the markdown format for comments
            comment_content = is_reply_comment + comment_content.replace('\n', '>')
            comment_content = comment_content.replace('##This comment is private##', '[#This comment is private#]')

            # Replace emoji images
            soup = BeautifulSoup(comment_content, 'html.parser')
            for img in soup.find_all('img'):
                title_text = img.get('title', '')
                img.replace_with('[#'+title_text+']')

            comment_content = soup.get_text()

            # Save the URL address of the comment user
            if link_url:
                nickname = '['+remove_html_tags(nickname)+']'
                link_url = '(' + link_url[1] + ')   '
            else:
                link_url = ''
                nickname = remove_html_tags(nickname) + '   '

            if i == 1:
                article_comments += '\n'

            article_comments += is_reply_comment + avatar_url + nickname + link_url + comment_time + '\n' + comment_content + '\n'

        return article_comments
    else:
        return ''


def process_article(url):
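    # Fetch one archived article page, strip the web archive's prefix from
    # image URLs (proxying them through Baidu's download endpoint), save the
    # article body as markdown, then save its comments, following pagination.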
    print("Currently executing===="+url)
    response = requests.get(url)

    if response.status_code == 200:
        html_content = response.text
        soup = BeautifulSoup(html_content, 'html.parser')
        article_title = soup.find('h1', class_='article-title') 
        article_mate = soup.find('div', class_='article-meta') 
        article_article = soup.find('article', class_='article-content') 

        soup_content = BeautifulSoup(article_article.prettify(), 'html.parser')
        img_tags = soup_content.find_all('img')
        pattern = r"https://web.*?_/"
        
        for img_tag in img_tags:
            if 'data-original' in img_tag.attrs:
                original_url = img_tag['data-original']
            else:
                original_url = img_tag['src']

            cleaned_url = re.sub(pattern, '', original_url)
            new_url = 'https://image.baidu.com/search/down?url=' + cleaned_url
            img_tag['src'] = new_url
            del img_tag['data-original']

        article_comment = soup.find('div', class_='article_comment_list') 
        data = "###### `When you see this prompt, it means the current article has been migrated from the original emlog blog system. The publication time of the article is too long ago, and the arrangement and content may not be complete. Please understand.`\n\n" + '###' + article_title.text.strip()+'\n\n'+article_mate.text.strip().replace('\n', '').replace('\r', '').replace('\t', '')+'\n' + soup_content.prettify().replace('<article class="article-content">', '').replace('</article>', '')

        save_to_file(data + '\nUser Comments:\n\n', article_title.text.strip())

        data = comment(html_content)

        if not data:
            return

        save_to_file(data, article_title.text.strip())

        if article_comment:
            comment_links = re.findall(r'<a\s+href="(.*?)nts"', str(article_comment))

            if comment_links:
                print('There are paginated comment data')
                for link in comment_links:
                    url = link +"nts"
                    print(url)
                    response = requests.get(url)

                    if response.status_code == 200:
                        html_content = response.text
                        data = comment(html_content)
                        if not data:
                            return
                        save_to_file(data, article_title.text.strip())
                        print("Writing paginated comment data")
    else:
        print("Failed to retrieve the webpage.")


def main():
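    # content.txt holds one article per line, with fields separated by "----"
    # and the archived article URL in the first field.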
    with open(file_path, 'r', encoding='utf-8') as file:
        for line in file:
            segments = line.strip().split("----")

            if segments and segments[0]:  # skip blank lines and lines without a URL
                url = segments[0]  
                process_article(url)
            else:
                print("No URL found in the line.")
            print('Starting the next article')
            time.sleep(4)

if __name__ == "__main__":
    main()
