Using PHP to read from RSS and post to Bluesky

I am currently using a version of the following PHP code to automate the posting of content from World War II Database to the Bluesky social network. First, please note that you will need the following Bluesky information.

  • Bluesky Username – Likely the email address you use to log in to Bluesky
  • Password
  • Repository Name – Likely your Bluesky name followed by “.bsky.social”; for example, “ww2db.bsky.social”

There are two more things left to configure.

  • URL to the RSS Feed
  • File Path/Name for a Log File – This is used to avoid posting duplicate content on Bluesky

Below is the code. Step 1 obtains the access token that is used for the actual posting. Step 3 has a hard-coded value of “100”, which means the log file (used to keep track of which RSS content has already been posted) keeps only the most recent 100 records. Step 4 handles hashtags and URLs that may be present in the RSS content. Step 5 sends the post to Bluesky. Finally, Step 6 builds each post in the format of title-space-link, which you may wish to adjust based on your RSS feed structure.

<?php
$username = 'username_here';
$password = 'password_here';
$repoName = 'repo_name_here';
$rssUrl = 'rss_url_here';
$postedIdsFile = 'rss_to_bksy_posted_ids.txt';

// Change the following to "N" to output some debugging information
$silentMode = "Y";

// Step 1: Establish a function to get Bluesky access token
function getAccessToken($username, $password) {
    $url = 'https://bsky.social/xrpc/com.atproto.server.createSession';
    $data = [
        'identifier' => $username,
        'password' => $password
    ];

    $options = [
        CURLOPT_URL => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_POST => true,
        CURLOPT_POSTFIELDS => json_encode($data),
        CURLOPT_HTTPHEADER => [
            "Content-Type: application/json"
        ]
    ];

    $ch = curl_init();
    curl_setopt_array($ch, $options);
    $response = curl_exec($ch);
    curl_close($ch);

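    $response_data = json_decode((string) $response, true);
    // A failed request or rejected login yields no accessJwt; return null in that case
    return $response_data['accessJwt'] ?? null;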
}

// Step 2: Establish a function to fetch content from a RSS feed
function fetchRss($rssUrl) {
    $rss = simplexml_load_file($rssUrl);
    return $rss;
}

// Step 3: Establish a function to check if post ID exists in the log file, and to update the log file
function checkAndUpdatePostIds($postId, $file) {
    $postedIds = file_exists($file) ? file($file, FILE_IGNORE_NEW_LINES) : [];

    if (in_array($postId, $postedIds)) {
        return false;
    }

    array_push($postedIds, $postId);
    if (count($postedIds) > 100) {
        array_shift($postedIds);
    }
    file_put_contents($file, implode("\n", $postedIds) . "\n");
    return true;
}

// Step 4: Establish a function to handle Bluesky facets for hashtags and URLs
function extractFacets($content) {
    // PREG_OFFSET_CAPTURE returns every match together with its byte offset,
    // so repeated hashtags or URLs each get their own correct position
    preg_match_all('/#(\w+)/', $content, $hashtags, PREG_OFFSET_CAPTURE);
    preg_match_all('/https?:\/\/[^\s]+/', $content, $urls, PREG_OFFSET_CAPTURE);

    $facets = [];

    foreach ($hashtags[0] as $i => $match) {
        $startPos = $match[1];
        $facets[] = [
            'index' => [
                'byteStart' => $startPos,
                'byteEnd' => $startPos + strlen($match[0])
            ],
            'features' => [
                [
                    '$type' => 'app.bsky.richtext.facet#tag',
                    // The tag value is the hashtag without the leading "#"
                    'tag' => $hashtags[1][$i][0]
                ]
            ]
        ];
    }

    foreach ($urls[0] as $match) {
        $startPos = $match[1];
        $facets[] = [
            'index' => [
                'byteStart' => $startPos,
                'byteEnd' => $startPos + strlen($match[0])
            ],
            'features' => [
                [
                    '$type' => 'app.bsky.richtext.facet#link',
                    'uri' => $match[0]
                ]
            ]
        ];
    }

    return $facets;
}

// Step 5: Establish a function to post to Bluesky
function postToBluesky($accessJwt, $repoName, $content) {
    $facets = extractFacets($content);

    $url = 'https://bsky.social/xrpc/com.atproto.repo.createRecord';
    $data = [
        'repo' => $repoName,
        'collection' => 'app.bsky.feed.post',
        'record' => [
            'text' => $content,
            'createdAt' => date('c'),
            'facets' => $facets
        ]
    ];

    $options = [
        CURLOPT_URL => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_POST => true,
        CURLOPT_POSTFIELDS => json_encode($data),
        CURLOPT_HTTPHEADER => [
            "Content-Type: application/json",
            "Authorization: Bearer $accessJwt"
        ]
    ];

    $ch = curl_init();
    curl_setopt_array($ch, $options);
    $response = curl_exec($ch);
    curl_close($ch);

    return $response;
}

// Step 6: Put everything together

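$accessJwt = getAccessToken($username, $password);
if (!$accessJwt) {
    exit("Could not obtain a Bluesky access token; check the username and password\n");
}
$rss = fetchRss($rssUrl);
if ($rss === false) {
    exit("Could not load or parse the RSS feed at $rssUrl\n");
}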

foreach ($rss->channel->item as $item) {
    // Fall back to the link when the feed item has no <guid>
    $postId = (string) $item->guid ?: (string) $item->link;
    $content = (string) $item->title . " " . (string) $item->link;

    if (checkAndUpdatePostIds($postId, $postedIdsFile)) {
        if ($silentMode == "N") {
            echo "Posting: $content\n";
        }
        $response = postToBluesky($accessJwt, $repoName, $content);
        if ($silentMode == "N") {
            echo $response . "\n";
        }
    }
    else {
        if ($silentMode == "N") {
            echo "Duplicate post detected, skipping: $content\n";
        }
    }
}
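
A note on Step 4: Bluesky facets point at positions in the UTF-8 bytes of the post text, not at character positions, so titles with accented or non-Latin characters shift the numbers. The following is a minimal sketch of that idea, separate from the script above. The helper name linkByteRanges is hypothetical, and it only looks at URLs.

```php
<?php
// Hypothetical helper (not part of the script above): returns the UTF-8 byte
// range of every URL in $text, in the shape used for a facet's 'index'.
function linkByteRanges(string $text): array {
    // PREG_OFFSET_CAPTURE reports offsets in bytes, not characters
    preg_match_all('/https?:\/\/\S+/', $text, $matches, PREG_OFFSET_CAPTURE);

    $ranges = [];
    foreach ($matches[0] as $match) {
        $ranges[] = [
            'byteStart' => $match[1],
            'byteEnd'   => $match[1] + strlen($match[0]), // strlen() counts bytes
        ];
    }
    return $ranges;
}

// "\u{e9}" is "é", which occupies two bytes in UTF-8, so the URL below
// starts at byte 6 even though it is preceded by only five characters.
print_r(linkByteRanges("Caf\u{e9} https://example.com/a"));
```

Because preg_match_all() and strlen() both work in bytes, the two values line up with what the facet index expects without any extra conversion.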

As far as usage goes, you can refactor the various pieces to fit into your existing PHP-based management tool. As a shortcut, you can also take the above code as-is and run it via cron or other similar job schedulers.
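
When the script runs unattended from cron, it helps to know whether a post actually went through. Here is a minimal sketch, assuming that a successful createRecord call returns JSON containing a "uri" field and that a failed call returns JSON with "error" and "message" fields. The function name describePostResult is hypothetical, and its argument is the string returned by postToBluesky().

```php
<?php
// Hypothetical helper: turns the raw response from postToBluesky() into a
// short status line suitable for a cron log.
function describePostResult($response): string {
    if ($response === false) {
        return 'FAILED: no response (network or cURL error)';
    }
    $data = json_decode($response, true);
    if (!is_array($data)) {
        return 'FAILED: unexpected response body';
    }
    if (isset($data['uri'])) {
        return 'OK: ' . $data['uri'];
    }
    // Assumed error shape: {"error": "...", "message": "..."}
    $error = $data['error'] ?? 'UnknownError';
    $message = $data['message'] ?? '';
    return trim("FAILED: $error $message");
}

// Sample responses made up for illustration
echo describePostResult('{"uri":"at://did:plc:example/app.bsky.feed.post/abc","cid":"xyz"}'), "\n";
echo describePostResult('{"error":"ExpiredToken","message":"Token has expired"}'), "\n";
```

Note that the script records an item in the log file before posting, so a failed post is not retried on the next run; a status line like this at least makes the failure visible.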

My implementation of this code posts content to the WW2DB Bluesky page at https://bsky.app/profile/ww2db.bsky.social.
