  • Is this URL fixed: https://abs.twimg.com/responsive-web/client-web/main.90f9e505.js?

    What is the 90f9e505 part at the end? Do I need to update it frequently? Is there a way to get it programmatically once it changes?

  • Author Owner

    This is the URL of Twitter's compiled main app bundle, so it needs to be updated regularly (probably every few weeks). You can get the current one from the browser. If you have a way to fetch it automatically, that step can be prepended to the script.
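
One possible way to automate that lookup, as a minimal sketch: search already-downloaded HTML of the Twitter front page for the bundle URL. The helper name `find_main_bundle_url` is hypothetical, and the sketch assumes the page references the bundle by its absolute `abs.twimg.com` URL with a hexadecimal hash, as in the URL quoted above. If Twitter changes the naming scheme, the pattern will need adjusting.

```php
<?php
// Hypothetical helper (not part of the original script): returns the first
// main.<hash>.js bundle URL found in $html, or null if none is present.
function find_main_bundle_url(string $html): ?string
{
    // Assumption: the hash is a run of lowercase hex digits, e.g. "90f9e505".
    $pattern = '~https://abs\.twimg\.com/responsive-web/client-web/main\.[0-9a-f]+\.js~';

    return preg_match($pattern, $html, $matches) === 1 ? $matches[0] : null;
}

// Illustrative input; in practice $html would come from a cURL request.
$html = '<script src="https://abs.twimg.com/responsive-web/client-web/main.90f9e505.js"></script>';
echo find_main_bundle_url($html) ?? 'bundle URL not found', PHP_EOL;
```

Returning null instead of throwing lets the calling script fall back to a hard-coded URL when the lookup fails.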

    • Is there a way to change it to parse a specific tweet?

      it looks like this snippet returns data for $count tweets from a predefined $user account

      $user = 'twitterdev';
      $count = 100;

      is there any way to modify it to get the data of only one specific tweet?
      because I'm looking for a way to get just the images out of a given tweet link, without parsing the account itself

      for example,
      make it load https://twitter.com/TwitterDev/status/917867461688541184
      and make it return an array of links to the images in the tweet (media_url_https)


      Disclosure: I came here from Stack Overflow (specifically this question: https://stackoverflow.com/questions/65403350/how-can-i-scrape-twitter-now-that-they-require-javascript) and I don't have enough reputation to comment there

      I'm trying to find a way to get the images from a specific tweet link in PHP, without a dev account, for use in a webhook to Discord, because IFTTT doesn't include the image link among its ingredients (IFTTT only gives CreatedAt, TweetEmbedCode, LinkToTweet, UserName, and Text)

      Instead of having IFTTT send LinkToTweet directly as a webhook to my Discord server's channel and letting Discord render the embed (which doesn't always work), I plan to have IFTTT send LinkToTweet to my server, let the server parse the link to get the image link (alongside other properties), and put that link and the other details in a pre-formatted embed that is sent as a webhook to my Discord server's channel

      a pre-formatted embed sent from a webhook always works correctly, unlike the one Discord generates automatically from a Twitter link after the fact, which can fail randomly
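
The pre-formatted embed described above could be assembled roughly as follows. This is only a sketch: `build_discord_payload` is a hypothetical helper, the tweet text and image URL are made-up sample values, and the payload uses the `embeds`, `title`, `url`, `description`, and `image.url` fields of Discord's webhook embed format. Sending it (for example with cURL as a JSON POST to the webhook URL) is left out.

```php
<?php
// Hypothetical helper: builds the JSON-serialisable body for a Discord
// webhook call containing a single embed for one tweet.
function build_discord_payload(string $tweetUrl, string $author, string $text, ?string $imageUrl): array
{
    $embed = [
        'title'       => $author,
        'url'         => $tweetUrl,
        'description' => $text,
    ];

    // Only attach an image when the tweet actually has one.
    if ($imageUrl !== null) {
        $embed['image'] = ['url' => $imageUrl];
    }

    return ['embeds' => [$embed]];
}

$payload = build_discord_payload(
    'https://twitter.com/TwitterDev/status/917867461688541184',
    'TwitterDev',
    'Example tweet text',                      // made-up sample text
    'https://pbs.twimg.com/media/EXAMPLE.jpg'  // made-up sample image URL
);
echo json_encode($payload, JSON_UNESCAPED_SLASHES), PHP_EOL;
```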

    • Update

      I found this Japanese article, which seems to show some more details
      https://note.com/kohnoselami/n/nb8ef0eea5831

      I was able to modify your code to suit my needs by changing this bit

      $curl = curl_init("https://twitter.com/i/api/2/timeline/profile/$userId.json?" . http_build_query($query, '', '&', PHP_QUERY_RFC3986));

      into something like this

      $curl = curl_init("https://twitter.com/i/api/2/timeline/conversation/$tweetId.json?" . http_build_query($query, '', '&', PHP_QUERY_RFC3986));

      and adjusting a few other settings to focus on the tweet instead of the profile

    • "and adjusting a few other settings to focus on the tweet instead of the profile"

      Can you go into more detail on which other settings you changed? I'd love to be able to focus on getting/updating tweet information instead of constantly updating user profile information to get all tweets.

    • technically, the majority of the lines were kept as is... I did modify some for my needs, like using webhooks instead of echo in some places, and I moved them into separate functions for readability

      lines 10 to 40 get the authorization that's needed for all the other requests

      also, since I could reuse lines 41 to 61 to get the details of additional users when needed, that part went into its own function as well... the tweet object only returns basic details of a mentioned or retweeted user, for example, so I reuse it to get more details (like the profile picture)

      so instead of just writing this (line 60)

      $userId = $page_json['data']['user']['rest_id'];

      I actually wrote it as

      $user_data       = $page_json['data']['user'];
      $user_id         = $user_data['rest_id'];
      $display_name    = $user_data['legacy']['name'];
      $screen_name     = $user_data['legacy']['screen_name'];
      $profile_picture = str_replace("_normal", "", $user_data['legacy']['profile_image_url_https']);

      since these fields are under 'legacy', this might break in the future, but for now it still works for me

      but note that none of that is relevant to the actual change in the code's functionality

      the actual code modification is very small:

      1. lines 63 to 90: I only need this particular query parameter for my use, so I removed the others
      $query = [
          'tweet_mode' => 'extended',
      ];
      2. line 91: I changed this
      $curl = curl_init("https://twitter.com/i/api/2/timeline/profile/$userId.json?" . http_build_query($query, '', '&', PHP_QUERY_RFC3986));

      into this

      $curl = curl_init("https://twitter.com/i/api/2/timeline/conversation/$tweetId.json?" . http_build_query($query, '', '&', PHP_QUERY_RFC3986));

      this is the same change I already showed earlier, in the message you're replying to
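
To make the shape of the resulting request URL explicit, here is a small sketch that wraps the same expression in a function. The name `conversation_url` is hypothetical. Note the `?` after `.json`: without it the query string would be glued directly onto the path.

```php
<?php
// Hypothetical wrapper around the expression used in the modified line 91.
function conversation_url(string $tweetId, array $query): string
{
    return "https://twitter.com/i/api/2/timeline/conversation/{$tweetId}.json?"
        . http_build_query($query, '', '&', PHP_QUERY_RFC3986);
}

echo conversation_url('917867461688541184', ['tweet_mode' => 'extended']), PHP_EOL;
```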

      in my case, I already have the $tweetId, since I'm parsing an individual tweet from a URL and the ID is already part of it: tweet URLs use the format https://twitter.com/<screen_name>/status/<tweetId>
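
Extracting the ID from such a URL could look like the sketch below. `extract_tweet_id` is a hypothetical helper. It returns the ID as a string because tweet IDs are too large for 32-bit integers, and it matches only the `/<screen_name>/status/<tweetId>` path format described above.

```php
<?php
// Hypothetical helper: returns the numeric tweet ID from a tweet URL,
// or null when the URL does not have the expected path format.
function extract_tweet_id(string $url): ?string
{
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return null; // malformed URL or no path component
    }

    return preg_match('~^/[^/]+/status/(\d+)~', $path, $matches) === 1
        ? $matches[1]
        : null;
}

$tweetId = extract_tweet_id('https://twitter.com/TwitterDev/status/917867461688541184');
echo $tweetId ?? 'not a tweet URL', PHP_EOL;
```

Working on the parsed path rather than the raw string means a trailing query string such as `?s=20` does not interfere with the match.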

      by the end of line 106

      $tweets = $json_page['globalObjects']['tweets'] ?? [];

      I have the full tweet object that was fetched earlier, so I can use it for further processing

      it looks something like this, but it varies from tweet to tweet... this example is for a normal tweet with a self-reply and other people replying to the same tweet

      {
          "<tweet id>": {
              "created_at": "<timestamp>",
              "id_str": "<tweet id>",
              "full_text": "<full tweet>",
              "display_text_range": [
                  <integer>,
                  <integer>
              ],
              "entities": [],
              "source": "<where tweet was written on, eg. TweetDeck>",
              "user_id_str": "<user id of person who wrote the tweet>",
              "retweet_count": <integer>,
              "favorite_count": <integer>,
              "conversation_id_str": "<tweet id>",
              "lang": "<detected language code, wrong most of the time in my case so don't depend on this>",
              "self_thread": {
                  "id_str": "<thread's tweet id>"
              }
          },
          "<tweet id of a self-reply>": {
              "created_at": "<timestamp>",
              "id_str": "<tweet id of a self-reply>",
              "full_text": "<content of replied tweet>",
              "display_text_range": [
                  <integer>,
                  <integer>
              ],
              "entities": [],
              "source": "<where tweet is written on>",
              "in_reply_to_status_id_str": "<id of tweet being replied to>",
              "in_reply_to_user_id_str": "<id of user of the tweet being replied to>",
              "in_reply_to_screen_name": "<username of who wrote the tweet that's being replied to>",
              "user_id_str": "<user id of the person who replied>",
              "retweet_count": <integer>,
              "favorite_count": <integer>,
              "conversation_id_str": "<original tweet's id>",
              "lang": "<detected language code>",
              "self_thread": {
                  "id_str": "<thread's tweet id>"
              }
          },
          "<tweet id of other's reply>": {
              "created_at": "<timestamp>",
              "id_str": "<tweet id of other's reply>",
              "full_text": "@<replied-to username> <replied text>",
              "display_text_range": [
                  <integer>,
                  <integer>
              ],
              "entities": {
                  "user_mentions": [
                      {
                          "screen_name": "<username>",
                          "name": "<the long name>",
                          "id_str": "<user id>",
                          "indices": [
                              <integer>,
                              <integer>
                          ]
                      }
                  ]
              },
              "source": "<where this particular reply is written on>",
              "in_reply_to_status_id_str": "<id of tweet being replied to>",
              "in_reply_to_user_id_str": "<id of user of the tweet being replied to>",
              "in_reply_to_screen_name": "<username of who wrote the tweet that's being replied to>",
              "user_id_str": "<user id of the person who replied>",
              "retweet_count": <integer>,
              "favorite_count": <integer>,
              "conversation_id_str": "<thread's tweet id>",
              "lang": "<detected language code>"
          }
      }

      display_text_range gives the character offsets of the part of full_text that is actually displayed: retweets usually have RT @<username> before the actual text, and replies have a bunch of @s before the actual text. These prefixes are retained in the API as part of the full tweet text, but only the actual tweet text is shown to the end user, without the RT @<username> or the @s attached
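
As a rough illustration of how that range can be applied, the sketch below cuts the displayed portion out of `full_text`. `displayed_text` is a hypothetical helper, the sample tweet is made up, and the sketch assumes the two offsets count Unicode characters (not bytes), with the end offset exclusive.

```php
<?php
// Hypothetical helper: returns the part of full_text that falls inside
// display_text_range, or the whole text when no range is given.
function displayed_text(array $tweet): string
{
    $text = $tweet['full_text'] ?? '';

    // Split into Unicode characters so offsets are not counted in bytes.
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY) ?: [];

    [$start, $end] = $tweet['display_text_range'] ?? [0, count($chars)];

    return implode('', array_slice($chars, $start, $end - $start));
}

// Made-up sample reply: the two leading mentions occupy characters 0 to 11.
$reply = [
    'full_text'          => '@alice @bob hello there',
    'display_text_range' => [12, 23],
];
echo displayed_text($reply), PHP_EOL;
```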

      I can't provide more examples of what the tweet object looks like, as they would be too long; experiment with that yourself
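
For the original goal of collecting image links (`media_url_https`), a sketch along these lines may help. The example object above contains no media, so the field layout here is an assumption based on Twitter's classic tweet JSON, where attached photos are listed under `extended_entities.media` (with `entities.media` as a fallback). `tweet_image_urls` is a hypothetical helper and the sample data is made up.

```php
<?php
// Hypothetical helper: collects media_url_https for every photo attached
// to the tweet with the given ID in the $tweets map.
function tweet_image_urls(array $tweets, string $tweetId): array
{
    $tweet = $tweets[$tweetId] ?? [];

    // Assumption: photos live in extended_entities.media, else entities.media.
    $media = $tweet['extended_entities']['media'] ?? $tweet['entities']['media'] ?? [];

    $urls = [];
    foreach ($media as $item) {
        if (($item['type'] ?? '') === 'photo' && isset($item['media_url_https'])) {
            $urls[] = $item['media_url_https'];
        }
    }

    return $urls;
}

// Made-up sample shaped like the $tweets array from the script.
$tweets = [
    '917867461688541184' => [
        'full_text'         => 'Example tweet',
        'entities'          => [],
        'extended_entities' => [
            'media' => [
                ['type' => 'photo', 'media_url_https' => 'https://pbs.twimg.com/media/EXAMPLE.jpg'],
            ],
        ],
    ],
];
print_r(tweet_image_urls($tweets, '917867461688541184'));
```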

    • You are awesome! This is enough to get me on the right track. Thank you!

  • NumairAwan @NumairAwan

    It's not fetching the tweets anymore. Any update?
